"MANIPULATION OF CELLULOSE AND/OR β-1,4-GLUCAN"

The present invention relates generally to isolated genes which encode polypeptides involved 
in cellulose biosynthesis and transgenic organisms expressing same in sense or antisense 
orientation, or as ribozymes, co-suppression or gene-targeting molecules. More particularly,
the present invention is directed to a nucleic acid molecule isolated from Arabidopsis
thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and
Eucalyptus ssp. which encodes an enzyme which is important in cellulose biosynthesis, in
particular the cellulose synthase enzyme and homologues, analogues and derivatives thereof,
and uses of same in the production of transgenic plants expressing altered cellulose
biosynthetic properties. 

Bibliographic details of the publications referred to by author in this specification are 
collected at the end of the description. Sequence identity numbers (SEQ ID Nos.) for the 
nucleotide and amino acid sequences referred to in the specification are defined after the
bibliography. 

Throughout the specification, unless the context requires otherwise, the word "comprise", or 
variations such as "comprises" or "comprising" will be understood to imply the inclusion of 
a stated element or integer or group of elements or integers but not the exclusion of any other
element or integer or group of elements or integers. 

Cellulose, the world's most abundant biopolymer, is the most characteristic component of
plant cell walls in so far as it forms much of the structural framework of the cell wall.
Cellulose is comprised of crystalline β-1,4-glucan microfibrils. The crystalline microfibrils
are extremely strong and resist enzymic and mechanical degradation, an important factor in
determining the nutritional quality, digestibility and palatability of animal and human
foodstuffs. As cellulose is also the dominant structural component of industrially-important
plant fibres, such as cotton, flax, hemp, jute and the timber crops such as Eucalyptus ssp. and
Pinus ssp., amongst others, there is considerable economic benefit to be derived from the
manipulation of cellulose content and/or quantity in plants. In particular, the production of
food and fibre crops with altered cellulose content is a highly desirable objective.

The synthesis of cellulose involves the β-1,4-linkage of glucose monomers, in the form of a
nucleoside diphosphoglucose such as UDP-glucose, to a pre-existing cellulose chain, catalysed
by the enzyme cellulose synthase.

Several attempts to identify the components of the functional cellulose synthase in plants have
failed, because levels of β-1,4-glucan or crystalline cellulose produced in such assays have
hitherto been too low to permit enzyme purification for protein sequence determination.
Insufficient homology between bacterial β-1,4-glucan synthase genes and plant cellulose
synthase genes has also prevented the use of hybridisation as an approach to isolating the
plant homologues of bacterial β-1,4-glucan (cellulose) synthases.

Furthermore, it has not been possible to demonstrate that the cellulose synthase enzyme from
plants is the same as, or functionally related to, other purified and characterised enzymes 
involved in polysaccharide biosynthesis. As a consequence, the cellulose synthase enzyme 
has not been isolated from plants and, until the present invention, no nucleic acid molecule 
has been characterised which functionally-encodes a plant cellulose synthase enzyme. 

In work leading up to the present invention, the inventors have generated several novel
mutant Arabidopsis thaliana plants which are defective in cellulose biosynthesis. The
inventors have further isolated a cellulose synthase gene designated RSW1, which is involved
in cellulose biosynthesis in Arabidopsis thaliana, and homologous sequences in Oryza sativa,
wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp. The isolated
nucleic acid molecules of the present invention provide the means by which cellulose content
and structure may be modified in plants to produce a range of useful fibres suitable for
specific industrial purposes, for example increased decay resistance of timber and altered
digestibility of foodstuffs, amongst others.

Accordingly, one aspect of the present invention provides an isolated nucleic acid molecule
comprising a sequence of nucleotides which encodes, or is complementary to a sequence 
which encodes a polypeptide of the cellulose biosynthetic pathway or a functional homologue, 
analogue or derivative thereof. 

The nucleic acid molecule of the invention may be derived from a prokaryotic source or a 
eukaryotic source. 

Those skilled in the art will be aware that cellulose production requires not only the presence
of a catalytic subunit, but also its activation and organisation into arrays which favour the
crystallization of glucan chains. This organisation is radically different between bacteria,
which possess linear arrays, and higher plants, which possess hexameric clusters or
"rosettes", of glucan chains. The correct organisation and activation of the bacterial enzyme
may require many factors which are either not known, or alternatively, not known to be
present in plant cells, for example specific membrane lipids to impart an active conformation
on the enzyme complex or protein, or the bacterial c-di-GMP activation system.
Accordingly, the use of a plant-derived sequence in eukaryotic cells such as plants provides
significant advantages compared to the use of bacterially-derived sequences.

Accordingly, the present invention does not extend to known genes encoding the catalytic
subunit of Agrobacterium tumefaciens or Acetobacter xylinum or Acetobacter pasteurianus 
cellulose synthase, or the use of such known bacterial genes and polypeptides to manipulate 
cellulose. 

Preferably, the subject nucleic acid molecule is derived from a eukaryotic organism.

In a more preferred embodiment of the invention, the isolated nucleic acid molecule of the 
invention encodes a plant cellulose synthase or a catalytic subunit thereof, or a homologue, 
analogue or derivative thereof. 



More preferably, the isolated nucleic acid molecule encodes a plant cellulose synthase
polypeptide which is associated with the primary cell wall of a plant cell. In an alternative 
preferred embodiment, the nucleic acid molecule of the invention encodes a plant cellulose 
synthase or catalytic subunit thereof which is normally associated with the secondary cell wall 
of a plant cell.

In a more preferred embodiment, the nucleic acid molecule of the invention is a cDNA 
molecule, genomic clone, mRNA molecule or a synthetic oligonucleotide molecule. 

In a particularly preferred embodiment, the present invention provides an isolated nucleic acid
molecule which encodes or is complementary to a nucleic acid molecule which encodes the
Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp.,
Brassica ssp., wheat, barley or maize cellulose synthase enzyme or a catalytic subunit thereof
or a polypeptide component, homologue, analogue or derivative thereof.

As exemplified herein, the present inventors have identified cellulose biosynthesis genes in
maize, wheat, barley, rice, cotton, Brassica ssp. and Eucalyptus ssp., in addition to the
specific Arabidopsis thaliana RSW1 gene sequence which has been shown to be particularly
useful for altering cellulose and/or β-1,4-glucan and/or starch levels in cells.

Hereinafter the term "polypeptide of the cellulose biosynthetic pathway" or similar term shall 
be taken to refer to a polypeptide or a protein or a part, homologue, analogue or derivative 
thereof which is involved in one or more of the biosynthetic steps leading to the production 
of cellulose or any related β-1,4-glucan polymer in plants. In the present context, a
polypeptide of the cellulose biosynthetic pathway shall also be taken to include both an active
enzyme which contributes to the biosynthesis of cellulose or any related β-1,4-glucan polymer
in plants and to a polypeptide component of such an enzyme. As used herein, a polypeptide 
of the cellulose biosynthetic pathway thus includes cellulose synthase. Those skilled in the art 
will be aware of other cellulose biosynthetic pathway polypeptides in plants.

The term "related β-1,4-glucan polymer" shall be taken to include any carbohydrate molecule
comprised of a primary structure of β-1,4-linked glucose monomers similar to the structure
of the components of the cellulose microfibril, wherein the relative arrangement or relative
configuration of the glucan chains may differ from their relative configuration in microfibrils
of cellulose. As used herein, a related β-1,4-glucan polymer includes those β-1,4-glucan
polymers wherein individual β-1,4-glucan microfibrils are arranged in an anti-parallel or
some other relative configuration not found in a cellulose molecule of plants and those
non-crystalline β-1,4-glucans described as lacking the resistance to extraction and degradation that
characterise cellulose microfibrils.

The term "cellulose synthase" shall be taken to refer to a polypeptide which is required to 
catalyse a β-1,4-glucan linkage to a cellulose microfibril.

Reference herein to "gene" is to be taken in its broadest context and includes:
(i) a classical genomic gene consisting of transcriptional and/or translational
regulatory sequences and/or a coding region and/or non-translated sequences (i.e.
introns, 5'- and 3'-untranslated sequences); or
(ii) mRNA or cDNA corresponding to the coding regions (i.e. exons) and 5'- and
3'-untranslated sequences of the gene.

The term "gene" is also used to describe synthetic or fusion molecules encoding all or part 
of a functional product. 

In the present context, the term "cellulose gene" or "cellulose genetic sequence" or similar 
term shall be taken to refer to any gene as hereinbefore defined which encodes a polypeptide
of the cellulose biosynthetic pathway and includes a cellulose synthase gene. 

The term "cellulose synthase gene" shall be taken to refer to any cellulose gene which
specifically encodes a polypeptide which is a component of a functional enzyme having
cellulose synthase activity, i.e. an enzyme which catalyses a β-1,4-glucan linkage to a
cellulose microfibril.

Preferred cellulose genes may be derived from a naturally-occurring cellulose gene by
standard recombinant techniques. Generally, a cellulose gene may be subjected to
mutagenesis to produce single or multiple nucleotide substitutions, deletions and/or additions.
Nucleotide insertional derivatives of the cellulose synthase gene of the present invention
include 5' and 3' terminal fusions as well as intra-sequence insertions of single or multiple
nucleotides. Insertional nucleotide sequence variants are those in which one or more
nucleotides are introduced into a predetermined site in the nucleotide sequence although
random insertion is also possible with suitable screening of the resulting product. Deletional
variants are characterised by the removal of one or more nucleotides from the sequence.
Substitutional nucleotide variants are those in which at least one nucleotide in the sequence
has been removed and a different nucleotide inserted in its place. Such a substitution may be
"silent" in that the substitution does not change the amino acid defined by the codon.
Alternatively, substitutions are designed to alter one amino acid for another similar acting
amino acid, or amino acid of like charge, polarity, or hydrophobicity.
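
By way of illustration only, the distinction drawn above between a "silent" substitution and one that alters the encoded amino acid may be sketched as follows. The sketch assumes the standard genetic code and a coding sequence read in frame from its first nucleotide; the function names and the six-nucleotide example sequence are hypothetical and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# '*' marks a stop codon. This table is an assumption of the sketch.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def substitution_is_silent(coding_sequence: str, position: int, new_base: str) -> bool:
    """Return True if replacing the base at `position` (0-based) with
    `new_base` leaves the translated amino acid sequence unchanged."""
    seq = coding_sequence.upper()
    variant = seq[:position] + new_base.upper() + seq[position + 1:]
    return translate(variant) == translate(seq)


if __name__ == "__main__":
    example = "ATGGCT"  # hypothetical sequence encoding Met-Ala
    print(translate(example))
    print(substitution_is_silent(example, 5, "C"))  # GCT -> GCC
    print(substitution_is_silent(example, 4, "A"))  # GCT -> GAT
```

In this sketch the third-position change leaves alanine unchanged, whereas the second-position change replaces alanine with aspartate and is therefore not silent.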

As used herein, the term "derived from" shall be taken to indicate that a particular integer or 
group of integers has originated from the species specified, but has not necessarily been 
obtained directly from the specified source.

For the present purpose, "homologues" of a nucleotide sequence shall be taken to refer to an
isolated nucleic acid molecule which is substantially the same as the nucleic acid molecule of
the present invention or its complementary nucleotide sequence, notwithstanding the
occurrence within said sequence, of one or more nucleotide substitutions, insertions,
deletions, or rearrangements.

"Analogues" of a nucleotide sequence set forth herein shall be taken to refer to an isolated
nucleic acid molecule which is substantially the same as a nucleic acid molecule of the present
invention or its complementary nucleotide sequence, notwithstanding the occurrence of any
non-nucleotide constituents not normally present in said isolated nucleic acid molecule, for
example carbohydrates, radiochemicals including radionucleotides, reporter molecules such
as, but not limited to DIG, alkaline phosphatase or horseradish peroxidase, amongst others.

"Derivatives" of a nucleotide sequence set forth herein shall be taken to refer to any isolated
nucleic acid molecule which contains significant sequence similarity to said sequence or a part
thereof. Generally, the nucleotide sequence of the present invention may be subjected to
mutagenesis to produce single or multiple nucleotide substitutions, deletions and/or insertions.
Nucleotide insertional derivatives of the nucleotide sequence of the present invention include
5' and 3' terminal fusions as well as intra-sequence insertions of single or multiple
nucleotides or nucleotide analogues. Insertional nucleotide sequence variants are those in
which one or more nucleotides or nucleotide analogues are introduced into a predetermined
site in the nucleotide sequence of said sequence, although random insertion is also possible
with suitable screening of the resulting product being performed. Deletional variants are
characterised by the removal of one or more nucleotides from the nucleotide sequence.
Substitutional nucleotide variants are those in which at least one nucleotide in the sequence
has been removed and a different nucleotide or nucleotide analogue inserted in its place.



The present invention extends to the isolated nucleic acid molecule when integrated into the 
genome of a cell as an addition to the endogenous cellular complement of cellulose synthase
genes. The said integrated nucleic acid molecule may, or may not, contain promoter 
sequences to regulate expression of the subject genetic sequence. 

The isolated nucleic acid molecule of the present invention may be introduced into and
expressed in any cell, for example a plant cell, fungal cell, insect cell, animal cell, yeast cell
or bacterial cell. Those skilled in the art will be aware of any modifications which are required
to the codon usage or promoter sequences or other regulatory sequences, in order for
expression to occur in such cells.

Another aspect of the present invention is directed to a nucleic acid molecule which comprises
a sequence of nucleotides corresponding or complementary to any one or more of the
sequences set forth in SEQ ID Nos:1, 3, 4, 5, 7, 9, 11, or 13, or having at least about 40%,
more preferably at least about 55%, still more preferably at least about 65%, yet still more
preferably at least about 75-80% and even still more preferably at least about 85-95%
nucleotide similarity to all, or a part thereof.
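
For the purposes of illustration only, the graded similarity thresholds recited above may be sketched as follows. The specification does not state how nucleotide similarity is calculated; the sketch assumes simple percent identity over two sequences that have already been aligned to equal length, with gap characters counted as mismatches. The function names and the example sequences are hypothetical.

```python
from typing import Optional

# Lower bounds of the preference tiers recited in the text (per cent).
SIMILARITY_TIERS = (40.0, 55.0, 65.0, 75.0, 85.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions carrying the same nucleotide.

    Assumes the inputs are pre-aligned and of equal length; a '-' gap
    character never counts as a match.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(similarity: float) -> Optional[float]:
    """Return the highest recited threshold that `similarity` reaches, or None."""
    met = [tier for tier in SIMILARITY_TIERS if similarity >= tier]
    return max(met) if met else None


if __name__ == "__main__":
    reference = "ATGCATGCAT"  # hypothetical 10-nucleotide stretch
    candidate = "ATGCATGGTT"  # differs at two positions
    pct = percent_identity(reference, candidate)
    print(pct, highest_tier_met(pct))
```

A practical comparison would ordinarily first align the sequences with an established alignment tool; the sketch deliberately leaves that step out.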

According to this aspect of the invention, said nucleic acid molecule encodes, or is 
complementary to a nucleotide sequence encoding, a polypeptide of the cellulose biosynthetic 
pathway in a plant or a homologue, analogue or derivative thereof.

Preferably, a nucleic acid molecule which is at least 40% related to any one or more of the
sequences set forth in SEQ ID Nos:1, 3, 4, 5, 7, 9, 11, or 13 comprises a nucleotide
sequence which encodes or is complementary to a sequence which encodes a plant cellulose
synthase, more preferably a cellulose synthase which is associated with the primary or the
secondary plant cell wall of the species from which it has been derived.

Furthermore, the nucleic acid molecule according to this aspect of the invention may be 
derived from a monocotyledonous or dicotyledonous plant species. In a particularly preferred 
embodiment, the nucleic acid molecule is derived from Arabidopsis thaliana, Oryza sativa, 
wheat, barley, maize, Brassica ssp., Gossypium hirsutum (cotton) or Eucalyptus ssp.,
amongst others. 

For the purposes of nomenclature, the nucleotide sequence shown in SEQ ID NO:1 relates
to a cellulose gene as hereinbefore defined which comprises a cDNA sequence designated
T20782 and which is derived from Arabidopsis thaliana. The amino acid sequence set forth
in SEQ ID NO:2 relates to the polypeptide encoded by T20782.

The nucleotide sequence set forth in SEQ ID NO:3 relates to the nucleotide sequence of the
complete Arabidopsis thaliana genomic gene RSW1, including both intron and exon
sequences. The nucleotide sequence of SEQ ID NO:3 comprises exons 1-14 of the genomic
gene and includes 2295bp of 5'-untranslated sequences, of which approximately the first
1.9kb comprises RSW1 promoter sequence (there is a putative TATA box motif at positions
1843-1850 of SEQ ID NO:3). The nucleotide sequence set forth in SEQ ID NO:3 is derived
from the cosmid clone 23H12. This sequence is also the genomic gene equivalent of SEQ ID
Nos:1 and 5.

The nucleotide sequence set forth in SEQ ID NO:4 relates to the partial nucleotide sequence
of a genomic gene variant of RSW1, derived from cosmid clone 12C4. The nucleotide
sequence of SEQ ID NO:4 comprises exon sequence 1-11 and part of exon 12 of the genomic
gene sequence and includes 862bp of 5'-untranslated sequences, of which approximately 700
nucleotides comprise RSW1 promoter sequences (there is a putative TATA box motif at
positions 668-673 of SEQ ID NO:4). The genomic gene sequence set forth in SEQ ID NO:4
is the equivalent of the cDNA sequence set forth in SEQ ID NO:7 (i.e. cDNA clone Ath-A).

The nucleotide sequence shown in SEQ ID NO:5 relates to a cellulose gene as hereinbefore
defined which comprises a cDNA equivalent of the Arabidopsis thaliana RSW1 gene set forth
in SEQ ID NO:3. The amino acid sequence set forth in SEQ ID NO:6 relates to the
polypeptide encoded by the wild-type RSW1 gene sequences set forth in SEQ ID Nos:3 and
5.

The nucleotide sequence shown in SEQ ID NO:7 relates to a cellulose gene as hereinbefore
defined which comprises a cDNA equivalent of the Arabidopsis thaliana RSW1 gene set forth
in SEQ ID NO:4. The nucleotide sequence is a variant of the nucleotide sequences set forth
in SEQ ID Nos:3 and 5. The amino acid sequence set forth in SEQ ID NO:8 relates to the
polypeptide encoded by the wild-type RSW1 gene sequences set forth in SEQ ID Nos:4 and
6.

The nucleotide sequence shown in SEQ ID NO:9 relates to a cellulose gene as hereinbefore
defined which comprises a further wild-type variant of the Arabidopsis thaliana RSW1 gene
set forth in SEQ ID Nos:3 and 5. The nucleotide sequence variant is designated Ath-B. The
amino acid sequence set forth in SEQ ID NO:10 relates to the polypeptide encoded by the
wild-type RSW1 gene sequence set forth in SEQ ID NO:9.

The nucleotide sequence shown in SEQ ID NO:11 relates to a cellulose gene as hereinbefore
defined which comprises a cDNA equivalent of the Arabidopsis thaliana rsw1 gene. The rsw1
gene is a mutant cellulose gene which produces a radial root swelling phenotype as described
by Baskin et al. (1992). The present inventors have shown herein that the rsw1 gene also
produces reduced inflorescence length, reduced fertility, misshapen epidermal cells, reduced
cellulose content and the accumulation of non-crystalline β-1,4-glucan, amongst others, when
expressed in plant cells. The rsw1 nucleotide sequence is a further variant of the nucleotide
sequences set forth in SEQ ID Nos:3 and 5. The amino acid sequence set forth in SEQ ID
NO:12 relates to the rsw1 polypeptide encoded by the mutant rsw1 gene sequence set forth
in SEQ ID NO:11.

The nucleotide sequence shown in SEQ ID NO:13 relates to a cellulose gene as hereinbefore
defined which comprises a cDNA equivalent of the Oryza sativa RSW1 or RSW1-like gene.
The nucleotide sequence is closely-related to the Arabidopsis thaliana RSW1 and rsw1
nucleotide sequences set forth herein (SEQ ID Nos:1, 3, 4, 5, 7, 9 and 11). The amino acid
sequence set forth in SEQ ID NO:14 relates to the polypeptide encoded by the RSW1 or
RSW1-like gene sequences set forth in SEQ ID NO:13.

Those skilled in the art will be aware of procedures for the isolation of further cellulose genes
to those specifically described herein, for example further cDNA sequences and genomic gene
equivalents, when provided with one or more of the nucleotide sequences set forth in SEQ
ID Nos:1, 3, 4, 5, 7, 9, 11, or 13. In particular, hybridisations may be performed using one
or more nucleic acid hybridisation probes comprising at least 10 contiguous nucleotides and
preferably at least 50 contiguous nucleotides derived from the nucleotide sequences set forth
herein, to isolate cDNA clones, mRNA molecules, genomic clones from a genomic library
(in particular genomic clones containing the entire 5' upstream region of the gene including
the promoter sequence, and the entire coding region and 3'-untranslated sequences), and/or
synthetic oligonucleotide molecules, amongst others. The present invention clearly extends
to such related sequences.

The invention further extends to any homologues, analogues or derivatives of any one or 
more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13.

A further aspect of the present invention contemplates a nucleic acid molecule which encodes 
or is complementary to a nucleic acid molecule which encodes, a polypeptide which is 
required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is capable 
of hybridising under at least low stringency conditions to the nucleic acid molecule set forth
in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary strand
thereof. 

As an exemplification of this embodiment, the present inventors have shown that it is possible
to isolate variants of the Arabidopsis thaliana RSW1 gene sequence set forth in SEQ ID
NO:3, by hybridisation under low stringency conditions. Such variants include related
sequences derived from Gossypium hirsutum (cotton), Eucalyptus ssp. and A. thaliana.
Additional variants are clearly encompassed by the present invention.

Preferably, the nucleic acid molecule further comprises a nucleotide sequence which encodes,
or is complementary to a nucleotide sequence which encodes, a cellulose synthase 
polypeptide, more preferably a cellulose synthase which is associated with the primary or 
secondary plant cell wall of the plant species from which said nucleic acid molecule was 
derived. 

More preferably, the nucleic acid molecule according to this aspect of the invention encodes 
or is complementary to a nucleic acid molecule which encodes, a polypeptide which is 
required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is capable 
of hybridising under at least medium stringency conditions to the nucleic acid molecule set 
forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary
strand thereof.

Even more preferably, the nucleic acid molecule according to this aspect of the invention 
encodes or is complementary to a nucleic acid molecule which encodes, a polypeptide which 
is required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is
capable of hybridising under at least high stringency conditions to the nucleic acid molecule
set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary
strand thereof. 

For the purposes of defining the level of stringency, a low stringency is defined herein as
being a hybridisation and/or a wash carried out in 6xSSC buffer, 0.1% (w/v) SDS at 28°C.
Generally, the stringency is increased by reducing the concentration of SSC buffer, and/or
increasing the concentration of SDS and/or increasing the temperature of the hybridisation
and/or wash. A medium stringency comprises a hybridisation and/or a wash carried out in
0.2xSSC-2xSSC buffer, 0.1% (w/v) SDS at 42°C to 65°C, while a high stringency comprises
a hybridisation and/or a wash carried out in 0.1xSSC-0.2xSSC buffer, 0.1% (w/v) SDS at a
temperature of at least 55°C. Conditions for hybridisations and washes are well understood
by one normally skilled in the art. For the purposes of further clarification only, reference
to the parameters affecting hybridisation between nucleic acid molecules is found in pages
2.10.8 to 2.10.16 of Ausubel et al. (1987), which is herein incorporated by reference.
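
The three stringency definitions given above may, for illustration only, be expressed as a small lookup. The sketch treats the low stringency definition as the single stated condition (6xSSC at 28°C) and the medium and high definitions as inclusive ranges; because those ranges overlap at 0.2xSSC between 55°C and 65°C, the function returns every label that matches rather than a single one. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Conditions:
    """A hybridisation or wash condition."""
    ssc_x: float        # SSC concentration as a multiple of 1xSSC
    sds_percent: float  # SDS concentration, % (w/v)
    temp_c: float       # temperature in degrees Celsius


# (label, min SSC, max SSC, min temperature, max temperature), taken from
# the definitions in the text. Treating them as inclusive ranges is an
# assumption of this sketch.
STRINGENCY_LEVELS = [
    ("low", 6.0, 6.0, 28.0, 28.0),
    ("medium", 0.2, 2.0, 42.0, 65.0),
    ("high", 0.1, 0.2, 55.0, math.inf),
]


def matching_stringencies(c: Conditions) -> List[str]:
    """Return the labels of all stringency definitions that `c` satisfies."""
    if not math.isclose(c.sds_percent, 0.1):
        return []  # every definition in the text specifies 0.1% (w/v) SDS
    return [
        label
        for label, lo_ssc, hi_ssc, lo_t, hi_t in STRINGENCY_LEVELS
        if lo_ssc <= c.ssc_x <= hi_ssc and lo_t <= c.temp_c <= hi_t
    ]


if __name__ == "__main__":
    print(matching_stringencies(Conditions(6.0, 0.1, 28.0)))
    print(matching_stringencies(Conditions(1.0, 0.1, 50.0)))
    print(matching_stringencies(Conditions(0.2, 0.1, 60.0)))
```

Returning a list makes the overlap between the medium and high definitions visible instead of silently choosing one label.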

In an even more preferred embodiment of the invention, the isolated nucleic acid molecule 
further comprises a sequence of nucleotides which is at least 40% identical to at least 10 
contiguous nucleotides derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or
13, or a complementary strand thereof.

Still more preferably, the isolated nucleic acid molecule further comprises a sequence of 
nucleotides which is at least 40% identical to at least 50 contiguous nucleotides derived from 
the sequence set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a
complementary strand thereof.

The present invention is particularly directed to a nucleic acid molecule which is capable of
functioning as a cellulose gene as hereinbefore defined, for example a cellulose synthase gene
such as, but not limited to, the Arabidopsis thaliana, Oryza sativa, wheat, barley, maize,
Brassica ssp., Gossypium hirsutum or Eucalyptus ssp. cellulose synthase genes, amongst
others. The subject invention clearly contemplates additional cellulose genes to those
specifically described herein which are derived from these plant species.

The invention further contemplates other sources of cellulose genes such as but not limited 
to, tissues and cultured cells of plant origin. Preferred plant species according to this 
embodiment include hemp, jute, flax and woody plants including, but not limited to Pinus
ssp., Populus ssp., Picea spp., amongst others.

A genetic sequence which encodes or is complementary to a sequence which encodes a
polypeptide which is involved in cellulose biosynthesis may correspond to the naturally
occurring sequence or may differ by one or more nucleotide substitutions, deletions and/or
additions. Accordingly, the present invention extends to cellulose genes and any functional
genes, mutants, derivatives, parts, fragments, homologues or analogues thereof or
non-functional molecules but which are at least useful as, for example, genetic probes, or primer
sequences in the enzymatic or chemical synthesis of said gene, or in the generation of
immunologically interactive recombinant molecules.

In a particularly preferred embodiment, the cellulose genetic sequences are employed to 
identify and isolate similar genes from plant cells, tissues, or organ types of the same species, 
or from the cells, tissues, or organs of another plant species. 

According to this embodiment, there is contemplated a method for identifying a related
cellulose gene or related cellulose genetic sequence, for example a cellulose synthase or
cellulose synthase-like gene, said method comprising contacting genomic DNA, or mRNA,
or cDNA with a hybridisation effective amount of a first cellulose genetic sequence
comprising any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary
sequence, homologue, analogue or derivative thereof derived from at least 10 contiguous
nucleotides of said first sequence, and then detecting said hybridisation.

WO 98/00549
PCT/AU97/00402

Preferably, the first genetic sequence comprises at least 50 contiguous nucleotides, even more 
preferably at least 100 contiguous nucleotides and even more preferably at least 500
contiguous nucleotides, derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or
13, or a complementary strand, homologue, analogue or derivative thereof. 

The related cellulose gene or related cellulose genetic sequence may be in a recombinant 
form, in a virus particle, bacteriophage particle, yeast cell, animal cell, or a plant cell.
Preferably, the related cellulose gene or related cellulose genetic sequence is derived from 
a plant species, such as a monocotyledonous plant or a dicotyledonous plant selected from the 
list comprising Arabidopsis thaliana, wheat, barley, maize, Brassica ssp., Gossypium 
hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., hemp, jute, flax, and woody plants 
including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others.

More preferably, the related cellulose gene or related cellulose genetic sequence is derived from
a plant which is useful in the fibre or timber industries, for example Gossypium hirsutum
(cotton), hemp, jute, flax, Eucalyptus ssp. or Pinus ssp., amongst others. Alternatively, the
related cellulose gene or related cellulose genetic sequence is derived from a plant which is
useful in the cereal or starch industry, for example wheat, barley, rice or maize, amongst
others.

In a particularly preferred embodiment, the first cellulose genetic sequence is labelled with 
a reporter molecule capable of giving an identifiable signal (e.g. a radioisotope such as ³²P
or ³⁵S or a biotinylated molecule).

An alternative method contemplated in the present invention involves hybridising two nucleic
acid "primer molecules" to a nucleic acid "template molecule" which comprises a related
cellulose gene or related cellulose genetic sequence or a functional part thereof, wherein the
first of said primers comprises contiguous nucleotides derived from any one or more of SEQ
ID Nos:1, 3, 4, 5, 7, 9, 11 or 13 or a homologue, analogue or derivative thereof and the
second of said primers comprises contiguous nucleotides complementary to any one or more
of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13. Specific nucleic acid molecule copies of the
template molecule are amplified enzymatically in a polymerase chain reaction, a technique
that is well known to one skilled in the art.
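
The relationship between the two primers described above, one taken from the sequence itself and one complementary to it, may be sketched as follows for illustration only. The template, the coordinates and the primer length in the sketch are hypothetical toy values rather than any sequence of the invention, and the sketch checks exact matches only, ignoring melting temperature, mismatches and the other practical considerations of primer design.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primer_pair(template: str, start: int, end: int, length: int = 20) -> Tuple[str, str]:
    """Pick a forward primer from the 5' end of template[start:end] and a
    reverse primer complementary to its 3' end."""
    region = template[start:end]
    if len(region) < 2 * length:
        raise ValueError("target region is too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


def in_silico_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the stretch of `template` bounded by the two primers, or None
    if either primer has no exact binding site in the expected orientation."""
    fwd_at = template.find(forward)
    if fwd_at == -1:
        return None
    reverse_site = reverse_complement(reverse)
    rev_at = template.find(reverse_site, fwd_at + len(forward))
    if rev_at == -1:
        return None
    return template[fwd_at:rev_at + len(reverse_site)]


if __name__ == "__main__":
    toy_template = "GGATCCATGGCTAGCTAGGTTAACCCGGGAATTCAA"  # hypothetical
    fwd, rev = design_primer_pair(toy_template, 6, 30, length=10)
    print(fwd, rev)
    print(in_silico_amplicon(toy_template, fwd, rev))
```

The reverse primer is written 5' to 3', as primers are conventionally reported, which is why it is the reverse complement of the template strand shown.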

In a preferred embodiment, each nucleic acid primer molecule is at least 10 nucleotides in 
length, more preferably at least 20 nucleotides in length, even more preferably at least 30 
nucleotides in length, still more preferably at least 40 nucleotides in length and even still
more preferably at least 50 nucleotides in length. 

Furthermore, the nucleic acid primer molecules consist of a combination of any of the
nucleotides adenine, cytidine, guanine, thymidine, or inosine, or functional analogues or
derivatives thereof which are at least capable of being incorporated into a polynucleotide
molecule without having an inhibitory effect on the hybridisation of said primer to the
template molecule in the environment in which it is used.

Furthermore, one or both of the nucleic acid primer molecules may be contained in an
aqueous mixture of other nucleic acid primer molecules, for example a mixture of degenerate
primer sequences which vary from each other by one or more nucleotide substitutions or
deletions. Alternatively, one or both of the nucleic acid primer molecules may be in a
substantially pure form.
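
By way of illustration only, a mixture of degenerate primers of the kind referred to above may
be enumerated from a single degenerate sequence. The following is a minimal sketch which
assumes that degeneracy is written with the standard IUPAC ambiguity codes and that only
substitutions (not deletions or inosine) are modelled. The function name expand_degenerate and
the example 10-mer are hypothetical and are not taken from any SEQ ID No.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes; inosine and deletions are not modelled.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical 10-mer with two degenerate positions (R = A/G, N = any base).
example = "ATGGARCCNT"
pool = expand_degenerate(example)
print(len(pool))  # 2 * 4 = 8 distinct primers in the mixture
```

The size of the mixture is the product of the number of alternatives at each position, so each
additional degenerate position multiplies the number of distinct primers present.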

The nucleic acid template molecule may be in a recombinant form, in a virus particle,
bacteriophage particle, yeast cell, animal cell, or a plant cell. Preferably, the nucleic acid
template molecule is derived from a plant cell, tissue or organ, in particular a cell, tissue or
organ derived from a plant selected from the list comprising Arabidopsis thaliana, Oryza
sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp,
jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea
spp., amongst others.

Those skilled in the art will be aware that there are many known variations of the basic
polymerase chain reaction procedure, which may be employed to isolate a related cellulose
gene or related cellulose genetic sequence when provided with the nucleotide sequences set
forth in any one or more of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13. Such variations are
discussed, for example, in McPherson et al. (1991). The present invention extends to the use
of all such variations in the isolation of related cellulose genes or related cellulose genetic
sequences using the nucleotide sequences embodied by the present invention.

The isolated nucleic acid molecule according to any of the further embodiments may be
cloned into a plasmid or bacteriophage molecule, for example to facilitate the preparation of
primer molecules or hybridisation probes or for the production of recombinant gene products.
Methods for the production of such recombinant plasmids, cosmids, bacteriophage molecules
or other recombinant molecules are well-known to those of ordinary skill in the art and can
be accomplished without undue experimentation. Accordingly, the invention further extends
to any recombinant plasmid, bacteriophage, cosmid or other recombinant molecule
comprising the nucleotide sequence set forth in any one or more of SEQ ID Nos: 1, 3, 4, 5,
7, 9, 11 or 13, or a complementary sequence, homologue, analogue or derivative thereof.

The nucleic acid molecule of the present invention is also useful for developing genetic
constructs which express a cellulose genetic sequence, thereby providing for the increased
expression of genes involved in cellulose biosynthesis in plants, selected for example from
the list comprising Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp.,
Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, and woody plants including, but
not limited to Pinus ssp., Populus ssp., Picea spp., amongst others. The present invention
particularly contemplates the modification of cellulose biosynthesis in cotton, hemp, jute,
flax, Eucalyptus ssp. and Pinus ssp., amongst others.

The present inventors have discovered that the genetic sequences disclosed herein are capable
of being used to modify the level of non-crystalline β-1,4-glucan, in addition to altering
cellulose levels when expressed, particularly when expressed in plant cells. In particular, the
Arabidopsis thaliana rsw1 mutant has increased levels of non-crystalline β-1,4-glucan, when
grown at 31 °C, compared to wild-type plants, grown under identical conditions. The
expression of a genetic sequence described herein in the antisense orientation in transgenic
plants grown at only 21 °C is shown to reproduce many aspects of the rsw1 mutant phenotype.

Accordingly, the present invention clearly extends to the modification of non-crystalline
β-1,4-glucan biosynthesis in plants, selected for example from the list comprising Arabidopsis
thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and
Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp.,
Populus ssp., Picea spp., amongst others. The present invention particularly contemplates
the modification of non-crystalline β-1,4-glucan biosynthesis in cotton, hemp, jute, flax,
Eucalyptus ssp. and Pinus ssp., amongst others.

The present invention further extends to the production and use of non-crystalline β-1,4-glucan
and to the use of the glucan to modify the properties of plant cell walls or cotton fibres or wood
fibres. Such modified properties are described herein (Example 13).

The inventors have discovered that the rsw1 mutant has altered carbon partitioning compared
to wild-type plants, resulting in significantly higher starch levels therein. The isolated nucleic
acid molecules provided herein are further useful for altering the carbon partitioning in a cell.
In particular, the present invention contemplates increased starch production in transgenic
plants expressing the nucleic acid molecule of the invention in the antisense orientation or
alternatively, expressing a ribozyme or co-suppression molecule comprising the nucleic acid
sequence of the invention.

The invention further contemplates reduced starch and/or non-crystalline β-1,4-glucan
production in transgenic plants expressing the nucleic acid molecule of the invention in the sense
orientation such that cellulose production is increased therein.

Where it is desired to increase cellulose production in a plant cell, the coding region of a
cellulose gene is placed operably behind a promoter, in the sense orientation, such that a
cellulose gene product is capable of being expressed under the control of said promoter
sequence. In a preferred embodiment, the cellulose genetic sequence is a cellulose synthase
genomic sequence, cDNA molecule or protein-coding sequence.

In a particularly preferred embodiment, the cellulose genetic sequence comprises a sequence
of nucleotides substantially the same as the sequence set forth in any one or more of SEQ ID
Nos: 1, 3, 4, 5, 7, 9, 11 or 13 or a homologue, analogue or derivative thereof.

Where it is desirable to reduce the content of cellulose or to increase the content of
non-crystalline β-1,4-glucan, the nucleic acid molecule of the present invention is expressed in
the antisense orientation under the control of a suitable promoter. Additionally, the nucleic acid
molecule of the invention is also useful for developing ribozyme molecules, or in
co-suppression of a cellulose gene. The expression of an antisense, ribozyme or co-suppression
molecule comprising a cellulose gene, in a cell such as a plant cell, fungal cell, insect cell,
animal cell, yeast cell or bacterial cell, may also increase the solubility, digestibility or
extractability of metabolites from plant tissues or, alternatively, increase the availability of
carbon as a precursor for any secondary metabolite other than cellulose (e.g. starch or
sucrose). By targeting the endogenous cellulose gene, expression is diminished, reduced or
otherwise lowered to a level that results in reduced deposition of cellulose in the primary or
secondary cell walls of the plant cell, fungal cell, insect cell, animal cell, yeast cell or
bacterial cell, and more particularly, a plant cell. Additionally, or alternatively, the content
of non-crystalline β-1,4-glucan is increased in such cells.

Co-suppression is the reduction in expression of an endogenous gene that occurs when one
or more copies of said gene, or one or more copies of a substantially similar gene are
introduced into the cell. The present invention also extends to the use of co-suppression to
inhibit the expression of a gene which encodes a cellulose gene product, such as but not
limited to cellulose synthase. Preferably, the co-suppression molecule of the present
invention targets a plant mRNA molecule which encodes a cellulose synthase enzyme, for
example a plant, fungus, or bacterial cellulose synthase mRNA, and more preferably a plant
mRNA derived from Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp.,
Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, or a woody plant such as Pinus
ssp., Populus ssp., or Picea spp., amongst others.

In a particularly preferred embodiment, the gene which is targeted by a co-suppression
molecule comprises a sequence of nucleotides set forth in any one or more of SEQ ID Nos: 1,
3, 4, 5, 7, 9, 11 or 13, or a complement, homologue, analogue or derivative thereof.

In the context of the present invention, an antisense molecule is an RNA molecule which is
transcribed from the complementary strand of a nuclear gene to that which is normally
transcribed to produce a "sense" mRNA molecule capable of being translated into a
polypeptide component of the cellulose biosynthetic pathway. The antisense molecule is
therefore complementary to the mRNA transcribed from a sense cellulose gene or a part
thereof. Although not limiting the mode of action of the antisense molecules of the present
invention to any specific mechanism, the antisense RNA molecule possesses the capacity to
form a double-stranded mRNA by base pairing with the sense mRNA, which may prevent
translation of the sense mRNA and subsequent synthesis of a polypeptide gene product.
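
By way of illustration only, the complementarity between an antisense molecule and a sense
mRNA described above may be sketched as a reverse-complement operation on the RNA
sequence. The sketch assumes ideal Watson-Crick pairing over the full length of the fragment.
The function names and the 12-nucleotide fragment are hypothetical and are not taken from
any SEQ ID No.

```python
# Watson-Crick base pairing for RNA.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(sense_mrna):
    """Return the antisense RNA (5'->3') complementary to a sense mRNA (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_mrna.upper()))


def pairs_with(sense_mrna, antisense):
    """True if the strands are fully complementary when aligned antiparallel."""
    return antisense_rna(sense_mrna) == antisense.upper()


sense = "AUGGCUUCAGGA"       # hypothetical sense mRNA fragment
anti = antisense_rna(sense)  # strand able to base pair along the whole fragment
print(anti)
```

Because the two strands pair in an antiparallel arrangement, the antisense sequence is read in
the reverse direction relative to the sense strand, and applying the operation twice returns the
original sense sequence.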

Preferably, the antisense molecule of the present invention targets a plant mRNA molecule
which encodes a cellulose gene product, for example cellulose synthase. Preferably, the
antisense molecule of the present invention targets a plant mRNA molecule which encodes
a cellulose synthase enzyme, for example a plant mRNA derived from Arabidopsis thaliana,
Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp.,
hemp, jute, flax, or a woody plant such as Pinus ssp., Populus ssp., or Picea spp., amongst
others.

In a particularly preferred embodiment, the antisense molecule of the invention targets an
mRNA molecule encoded by any one or more of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13, or
a homologue, analogue or derivative thereof.

Ribozymes are synthetic RNA molecules which comprise a hybridising region complementary
to two regions, each of at least 5 contiguous nucleotide bases in the target sense mRNA. In
addition, ribozymes possess highly specific endoribonuclease activity, which autocatalytically
cleaves the target sense mRNA. A complete description of the function of ribozymes is
presented by Haseloff and Gerlach (1988) and contained in International Patent Application
No. WO89/05852.

The present invention extends to ribozymes which target a sense mRNA encoding a cellulose
gene product, thereby hybridising to said sense mRNA and cleaving it, such that it is no
longer capable of being translated to synthesise a functional polypeptide product. Preferably,
the ribozyme molecule of the present invention targets a plant mRNA molecule which encodes
a cellulose synthase enzyme, for example a plant mRNA derived from Arabidopsis thaliana,
Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., hemp, jute, flax, or a
woody plant such as Pinus ssp., Populus ssp., or Picea spp., amongst others.

In a particularly preferred embodiment, the ribozyme molecule will target an mRNA encoded
by any one or more of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13, or a homologue, analogue or
derivative thereof.

According to this embodiment, the present invention provides a ribozyme or antisense
molecule comprising at least 5 contiguous nucleotide bases derived from any one or more of
SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13, or a complementary nucleotide sequence or a
homologue, analogue or derivative thereof, wherein said antisense or ribozyme molecule is
able to form a hydrogen-bonded complex with a sense mRNA encoding a cellulose gene
product to reduce translation thereof.

In a preferred embodiment, the antisense or ribozyme molecule comprises at least 10 to 20
contiguous nucleotides derived from any one or more of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or
13, or a complementary nucleotide sequence or a homologue, analogue or derivative thereof.
Although the preferred antisense and/or ribozyme molecules hybridise to at least about 10 to
20 nucleotides of the target molecule, the present invention extends to molecules capable of
hybridising to at least about 50-100 nucleotide bases in length, or a molecule capable of
hybridising to a full-length or substantially full-length mRNA encoded by a cellulose gene,
such as a cellulose synthase gene.

Those skilled in the art will be aware of the necessary conditions, if any, for selecting or 
preparing the antisense or ribozyme molecules of the invention. 

It is understood in the art that certain modifications, including nucleotide substitutions
amongst others, may be made to the antisense and/or ribozyme molecules of the present
invention, without destroying the efficacy of said molecules in inhibiting the expression of
a gene encoding a cellulose gene product such as cellulose synthase. It is therefore within the
scope of the present invention to include any nucleotide sequence variants, homologues,
analogues, or fragments of the said gene encoding same, the only requirement being that said
nucleotide sequence variant, when transcribed, produces an antisense and/or ribozyme
molecule which is capable of hybridising to a sense mRNA molecule which encodes a
cellulose gene product.

Gene targeting is the replacement of an endogenous gene sequence within a cell by a related
DNA sequence to which it hybridises, thereby altering the form and/or function of the
endogenous gene and the subsequent phenotype of the cell. According to this embodiment,
at least a part of the DNA sequence defined by any one or more of SEQ ID Nos: 1, 3, 4, 5,
7, 9, 11 or 13, or a related cellulose genetic sequence, may be introduced into target cells
containing an endogenous cellulose gene, thereby replacing said endogenous cellulose gene.
According to this embodiment, the polypeptide product of said cellulose genetic sequence
possesses different catalytic activity and/or expression characteristics, producing in turn
modified cellulose deposition in the target cell. In a particularly preferred embodiment of the
invention, the endogenous cellulose gene of a plant is replaced with a gene which is merely
capable of producing non-crystalline β-1,4-glucan polymers or alternatively which is capable
of producing a modified cellulose having properties similar to synthetic fibres such as rayon,
in which the β-1,4-glucan polymers are arranged in an antiparallel configuration relative to
one another.

The present invention extends to genetic constructs designed to facilitate expression of a
cellulose genetic sequence which is identical, or complementary to the sequence set forth in
any one or more of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13, or a functional derivative, part,
homologue, or analogue thereof, or a genetic construct designed to facilitate expression of
a sense molecule, an antisense molecule, ribozyme molecule, co-suppression molecule, or
gene targeting molecule containing said genetic sequence.

The said genetic construct of the present invention comprises the foregoing sense, antisense,
or ribozyme, or co-suppression nucleic acid molecule, or gene-targeting molecule, placed
operably under the control of a promoter sequence capable of regulating the expression of the
said nucleic acid molecule in a prokaryotic or eukaryotic cell, preferably a plant cell. The
said genetic construct optionally comprises, in addition to a promoter and sense, or antisense,
or ribozyme, or co-suppression, or gene-targeting nucleic acid molecule, a terminator
sequence.

The term "terminator" refers to a DNA sequence at the end of a transcriptional unit which
signals termination of transcription. Terminators are 3'-non-translated DNA sequences
containing a polyadenylation signal, which facilitates the addition of polyadenylate sequences
to the 3'-end of a primary transcript. Terminators active in plant cells are known and
described in the literature. They may be isolated from bacteria, fungi, viruses, animals
and/or plants. Examples of terminators particularly suitable for use in the genetic constructs
of the present invention include the nopaline synthase (NOS) gene terminator of
Agrobacterium tumefaciens, the terminator of the Cauliflower mosaic virus (CaMV) 35S
gene, and the zein gene terminator from Zea mays.

Reference herein to a "promoter" is to be taken in its broadest context and includes the
transcriptional regulatory sequences of a classical genomic gene, including the TATA box
which is required for accurate transcription initiation, with or without a CCAAT box
sequence and additional regulatory elements (i.e. upstream activating sequences, enhancers
and silencers) which alter gene expression in response to developmental and/or external
stimuli, or in a tissue-specific manner. A promoter is usually, but not necessarily, positioned
upstream, or 5', of a structural gene, the expression of which it regulates. Furthermore, the
regulatory elements comprising a promoter are usually positioned within 2 kb of the start site
of transcription of the gene.

In the present context, the term "promoter" is also used to describe a synthetic or fusion
molecule, or derivative which confers, activates or enhances expression of said sense,
antisense, or ribozyme, or co-suppression nucleic acid molecule, in a plant cell. Preferred
promoters may contain additional copies of one or more specific regulatory elements, to
further enhance expression of a sense, antisense, ribozyme or co-suppression molecule and/or
to alter the spatial expression and/or temporal expression of said sense or antisense, or
ribozyme, or co-suppression, or gene-targeting molecule. For example, regulatory elements
which confer copper inducibility may be placed adjacent to a heterologous promoter sequence
driving expression of a sense, or antisense, or ribozyme, or co-suppression, or gene-targeting
molecule, thereby conferring copper inducibility on the expression of said molecule.

Placing a sense or ribozyme, or antisense, or co-suppression, or gene-targeting molecule
under the regulatory control of a promoter sequence means positioning the said molecule such
that expression is controlled by the promoter sequence. Promoters are generally positioned
5' (upstream) to the genes that they control. In the construction of heterologous
promoter/structural gene combinations it is generally preferred to position the promoter at a
distance from the gene transcription start site that is approximately the same as the distance
between that promoter and the gene it controls in its natural setting, i.e., the gene from which
the promoter is derived. As is known in the art, some variation in this distance can be
accommodated without loss of promoter function. Similarly, the preferred positioning of a
regulatory sequence element with respect to a heterologous gene to be placed under its control
is defined by the positioning of the element in its natural setting, i.e., the genes from which
it is derived. Again, as is known in the art, some variation in this distance can also occur.

Examples of promoters suitable for use in genetic constructs of the present invention include
viral, fungal, bacterial, animal and plant derived promoters capable of functioning in
prokaryotic or eukaryotic cells. Preferred promoters are those capable of regulating the
expression of the subject cellulose genes of the invention in plant cells, fungal cells, insect
cells, yeast cells, animal cells or bacterial cells, amongst others. Particularly preferred
promoters are capable of regulating expression of the subject nucleic acid molecules in plant
cells. The promoter may regulate the expression of the said molecule constitutively, or
differentially with respect to the tissue in which expression occurs, or with respect to the
developmental stage at which expression occurs, or in response to external stimuli such as
physiological stresses, or plant pathogens, or metal ions, amongst others. Preferably, the
promoter is capable of regulating expression of a sense, or ribozyme, or antisense, or
co-suppression, or gene-targeting molecule, in a plant cell. Examples of preferred promoters
include the CaMV 35S promoter, NOS promoter, octopine synthase (OCS) promoter and the
like.

In a most preferred embodiment, the promoter is capable of expression in any plant cell, such
as, but not limited to a plant selected from the list comprising Arabidopsis thaliana, Oryza
sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp,
jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea
spp., amongst others.

In a particularly preferred embodiment, the promoter may be derived from a genomic clone
encoding a cellulose gene product, in particular the promoter contained in the sequence set
forth in SEQ ID NO:3 or SEQ ID NO:4. Preferably, the promoter sequence comprises
nucleotides 1 to about 1900 of SEQ ID NO:3 or nucleotides 1 to about 700 of SEQ ID NO:4
or a homologue, analogue or derivative capable of hybridizing thereto under at least low
stringency conditions.
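
By way of illustration only, the promoter regions recited above are defined by nucleotide
positions which are counted from 1 and include both end points, whereas most programming
languages index from 0. The following minimal sketch shows the conversion. The function
name subsequence and the toy sequence are hypothetical; the actual sequences are those of
the sequence listing and are not reproduced here.

```python
def subsequence(seq, start, end):
    """Return nucleotides start..end of seq, numbered from 1 and inclusive."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    # Position 1 is index 0; the slice end is exclusive, so 'end' needs no offset.
    return seq[start - 1:end]


toy = "ACGTACGTAC"                  # hypothetical 10-nucleotide sequence
region = subsequence(toy, 1, 4)     # nucleotides 1 to 4
print(region)
```

Applied to a genomic sequence held as a string, a call of the form subsequence(seq, 1, 1900)
would return the first 1900 nucleotides.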

Optionally, the genetic construct of the present invention further comprises a terminator 
sequence. 

In an exemplification of this embodiment, there is provided a binary genetic construct
comprising the isolated sequence of nucleotides set forth in SEQ ID NO:3. There
is also provided a genetic construct comprising the isolated sequence of nucleotides
set forth in SEQ ID NO:1, in the antisense orientation, placed operably in connection with
the CaMV 35S promoter.

In the present context, the term "in operable connection with" means that expression of the
isolated nucleotide sequence is under the control of the promoter sequence with which it is
connected, regardless of the relative physical distance of the sequences from each other or
their relative orientation with respect to each other.

An alternative embodiment of the invention is directed to a genetic construct comprising a
promoter or functional derivative, part, fragment, homologue, or analogue thereof, which is
capable of directing the expression of a polypeptide early in the development of a plant cell
at a stage when the cell wall is developing, such as during cell expansion or during cell
division. In a particularly preferred embodiment, the promoter is contained in the sequence
set forth in SEQ ID NO:3 or SEQ ID NO:4. Preferably, the promoter sequence comprises
nucleotides 1 to about 1900 of SEQ ID NO:3 or nucleotides 1 to about 700 of SEQ ID NO:4
or a homologue, analogue or derivative capable of hybridizing thereto under at least low
stringency conditions.

The polypeptide may be a reporter molecule which is encoded by a gene such as the bacterial
β-glucuronidase gene or chloramphenicol acetyltransferase gene or alternatively, the firefly
luciferase gene. Alternatively, the polypeptide may be encoded by a gene which is capable
of producing a modified cellulose in the plant cell when placed in combination with the
normal complement of cellulose genes which are expressible therein, for example it may be
a cellulose-like gene obtained from a bacterial or fungal source or a cellulose gene obtained
from a plant source.

The genetic constructs of the present invention are particularly useful in the production of
crop plants with altered cellulose content or structure. In particular, the rate of cellulose
deposition may be reduced, leading to a reduction in the total cellulose content of plants, by
transferring one or more of the antisense, ribozyme or co-suppression molecules described
supra into a plant or alternatively, the same or similar end-result may be achieved by
replacing an endogenous cellulose gene with an inactive or modified cellulose gene using
gene-targeting approaches. The benefits to be derived from reducing cellulose content in
plants are especially apparent in food and fodder crops such as, but not limited to maize,
wheat, barley, rye, rice, millet or sorghum, amongst others, where improved
digestibility of said crop is desired. The foregoing antisense, ribozyme or co-suppression
molecules are also useful in producing plants with altered carbon partitioning such that
increased carbon is available for growth, rather than deposited in the form of cellulose.

Alternatively, the introduction to plants of additional copies of a cellulose gene in the 'sense'
orientation and under the control of a strong promoter is useful for the production of plants
with increased cellulose content or more rapid rates of cellulose biosynthesis. Accordingly,
such plants may exhibit a range of desired traits including, but not limited to modified
strength and/or shape and/or properties of fibres, cells and plants, increased protection against
chemical, physical or environmental stresses such as dehydration, heavy metals (e.g.
cadmium), cold, heat or wind, increased resistance to attack by pathogens such as insects,
nematodes and the like which physically penetrate the cell wall barrier during
invasion/infection of the plant.






WO 98/00549 



PCT/AU97/00402 




Alternatively, the production of plants with altered physical properties is made possible by
the introduction thereto of altered cellulose gene(s). Such plants may produce β-1,4-glucan
which is either non-crystalline or shows altered crystallinity. Such plants may also exhibit
a range of desired traits including but not limited to, altered dietary fibre content, altered
digestibility and degradability, or altered extractability properties.

Furthermore, genetic constructs comprising a plant cellulose gene in the 'sense' orientation
may be used to complement the existing range of cellulose genes present in a plant, thereby
altering the composition or timing of deposition of cellulose in the cell wall of said
plant. In a preferred embodiment, the cellulose gene from one plant species or a β-1,4-glucan
synthase gene from a non-plant species is used to transform a plant of a different species,
thereby introducing novel cellulose biosynthetic metabolism to the second-mentioned plant
species.

In a related embodiment, a recombinant fusion polypeptide may be produced containing the
active site from one cellulose gene product fused to another cellulose gene product, wherein
said fusion polypeptide exhibits novel catalytic properties compared to either 'parent'
polypeptide from which it is derived. Such fusion polypeptides may be produced by
conventional recombinant DNA techniques known to those skilled in the art, either by
introducing a recombinant DNA capable of expressing the entire fusion polypeptide into said
plant or alternatively, by a gene-targeting approach in which recombination at the DNA level
occurs in vivo and the resultant gene is capable of expressing a recombinant fusion
polypeptide.

The present invention extends to all transgenic methods and products described supra,
including genetic constructs.

The recombinant DNA molecule carrying the sense, antisense, ribozyme or co-suppression
molecule of the present invention and/or genetic construct comprising the same, may be
introduced into plant tissue, thereby producing a "transgenic plant", by various techniques
known to those skilled in the art. The technique used for a given plant species or specific
type of plant tissue depends on the known successful techniques. Means for introducing
recombinant DNA into plant tissue include, but are not limited to, transformation
(Paszkowski et al., 1984), electroporation (Fromm et al., 1985), or microinjection of the
DNA (Crossway et al., 1986), or T-DNA-mediated transfer from Agrobacterium to the plant
tissue. Representative T-DNA vector systems are described in the following references: An
et al. (1985); Herrera-Estrella et al. (1983a, b); Herrera-Estrella et al. (1985). Once
introduced into the plant tissue, the expression of the introduced gene may be assayed in a
transient expression system, or it may be determined after selection for stable integration
within the plant genome. Techniques are known for the in vitro culture of plant tissue, and
in a number of cases, for regeneration into whole plants. Procedures for transferring the
introduced gene from the originally transformed plant into commercially useful cultivars are
known to those skilled in the art.

A still further aspect of the present invention extends to a transgenic plant such as a crop
plant, carrying the foregoing sense, antisense, ribozyme, co-suppression, or gene-targeting
molecule and/or genetic constructs comprising the same. Preferably, the transgenic plant is
one or more of the following: Arabidopsis thaliana, Oryza sativa, wheat, barley, maize,
Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, Pinus ssp.,
Populus ssp., or Picea spp. Additional species are not excluded.

The present invention further extends to the progeny of said transgenic plant. 

Yet another aspect of the present invention provides for the expression of the subject genetic
sequence in a suitable host (e.g. a prokaryote or eukaryote) to produce full length or non-full
length recombinant cellulose gene products.

Hereinafter the term "cellulose gene product" shall be taken to refer to a recombinant product
of a cellulose gene as hereinbefore defined. Accordingly, the term "cellulose gene product"
includes a polypeptide product of any gene involved in the cellulose biosynthetic pathway in
plants, such as, but not limited to a cellulose synthase gene product.

Preferably, the recombinant cellulose gene product comprises an amino acid sequence having
the catalytic activity of a cellulose synthase polypeptide or a functional mutant, derivative,
part, fragment, or analogue thereof.

In a particularly preferred embodiment of the invention, the recombinant cellulose gene
product comprises a sequence of amino acids that is at least 40% identical to any one or more
of SEQ ID Nos: 2, 6, 8, 10, 12 or 14, or a homologue, analogue or derivative thereof.
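
By way of illustration only, a percentage identity of the kind recited above may be computed
from two amino acid sequences which have already been aligned. The sketch below assumes
that gaps are written as "-" and that identity is expressed over all aligned columns in which at
least one sequence has a residue. That choice of denominator is an assumption made for the
example, and other conventions exist. The function name and the two short aligned sequences
are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column that is empty in both sequences carries no information
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments in one-letter code.
identity = percent_identity("MKT-LLVA", "MRTALLIA")
print(identity)            # 5 identical columns out of 8
print(identity >= 40.0)    # compare against the 40% threshold
```

The alignment itself is taken as given here; producing it requires a separate alignment
procedure which is outside the scope of this sketch.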

Single and three-letter abbreviations used for amino acid residues contained in the
specification are provided in Table 1.
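
By way of illustration only, the standard three-letter and one-letter abbreviations of the kind
listed in Table 1 may be applied mechanically to convert between the two notations. The
following minimal sketch covers the twenty standard amino acids and the "Xaa" placeholder.
The entry for D-alanine is omitted because its one-letter symbol coincides with that of "any
amino acid". The function name, the hyphen separator and the example peptide are
hypothetical.

```python
# Standard three-letter to one-letter amino acid codes, plus Xaa for any residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(peptide, sep="-"):
    """Convert a peptide written as 'Met-Ala-Ser' into one-letter code ('MAS')."""
    residues = [res.strip().capitalize() for res in peptide.split(sep)]
    return "".join(THREE_TO_ONE[res] for res in residues)


short = to_one_letter("Met-Ala-Ser-Trp")  # hypothetical tetrapeptide
print(short)
```

An unrecognised abbreviation raises a KeyError, which is preferable to silently dropping a
residue from the converted sequence.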

In the present context, "homologues" of an amino acid sequence refer to those polypeptides,
enzymes or proteins which have a similar catalytic activity to the amino acid sequences
described herein, notwithstanding any amino acid substitutions, additions or deletions thereto.
A homologue may be isolated or derived from the same or another plant species as the species
from which the polypeptides of the invention are derived.

20 "Analogues" encompass polypeptides of the invention notwithstanding the occurrence of any 
non-naturally occurring amino acid analogues therein. 

" Derivatives " include modified peptides in which ligands are attached to one or more of the 
amino acid residues contained therein, such as carbohydrates, enzymes, proteins, polypeptides 

25 or reporter molecules such as radionuclides or fluorescent compounds. Glycosylated, 
fluorescent, acylated or alkylated forms of the subject peptides are particularly contemplated 
by the present invention. Additionally, derivatives of an amino acid sequence described 
herein which comprises fragments or parts of the subject amino acid sequences are within the 
scope of the invention, as are homopolymers or heteropolymers comprising two or more 

30 copies of the subject polypeptides. Procedures for derivatizing peptides are well-known in the 
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art. 

TABLE 1

Amino Acid        Three-letter    One-letter
                  Abbreviation    Symbol

Alanine           Ala             A
Arginine          Arg             R
Asparagine        Asn             N
Aspartic acid     Asp             D
Cysteine          Cys             C
D-alanine         Dal             X
Glutamine         Gln             Q
Glutamic acid     Glu             E
Glycine           Gly             G
Histidine         His             H
Isoleucine        Ile             I
Leucine           Leu             L
Lysine            Lys             K
Methionine        Met             M
Phenylalanine     Phe             F
Proline           Pro             P
Serine            Ser             S
Threonine         Thr             T
Tryptophan        Trp             W
Tyrosine          Tyr             Y
Valine            Val             V
Any amino acid    Xaa             X

Substitutions encompass amino acid alterations in which an amino acid is replaced with a
different naturally-occurring or a non-conventional amino acid residue. Such substitutions
may be classified as "conservative", in which an amino acid residue contained in a cellulose
gene product is replaced with another naturally-occurring amino acid of similar character, for
example Gly↔Ala, Val↔Ile↔Leu, Asp↔Glu, Lys↔Arg, Asn↔Gln or Phe↔Trp↔Tyr.
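The conservative groupings listed above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `classify_substitution` and the use of one-letter symbols are assumptions made for this example, and only the six groupings named in the text are encoded.

```python
# Conservative groups as listed in the text, written with one-letter symbols:
# Gly<->Ala, Val<->Ile<->Leu, Asp<->Glu, Lys<->Arg, Asn<->Gln, Phe<->Trp<->Tyr.
CONSERVATIVE_GROUPS = [
    {"G", "A"},
    {"V", "I", "L"},
    {"D", "E"},
    {"K", "R"},
    {"N", "Q"},
    {"F", "W", "Y"},
]


def classify_substitution(original: str, replacement: str) -> str:
    """Classify a single-residue substitution (hypothetical helper).

    Returns "identical", "conservative" or "non-conservative" according to
    the groupings above; any pair not sharing a group is non-conservative.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return "identical"
    for group in CONSERVATIVE_GROUPS:
        if original in group and replacement in group:
            return "conservative"
    return "non-conservative"
```

For instance, `classify_substitution("V", "L")` falls in the Val/Ile/Leu group, whereas replacing a charged residue with alanine, as in `classify_substitution("D", "A")`, does not share a group and is reported as non-conservative.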

Substitutions encompassed by the present invention may also be "non-conservative", in which
an amino acid residue which is present in a cellulose gene product described herein is
substituted with an amino acid with different properties, such as a naturally-occurring amino
acid from a different group (e.g. substituting a charged or hydrophobic amino acid with
alanine), or alternatively, in which a naturally-occurring amino acid is substituted with a
non-conventional amino acid.

Non-conventional amino acids encompassed by the invention include, but are not limited to,
those listed in Table 2.

Amino acid substitutions are typically of single residues, but may be of multiple residues, 
either clustered or dispersed. 

Amino acid deletions will usually be of the order of about 1-10 amino acid residues, while
insertions may be of any length. Deletions and insertions may be made to the N-terminus,
the C-terminus or be internal deletions or insertions. Generally, insertions within the amino
acid sequence will be smaller than amino- or carboxy-terminal fusions and of the order of 1-4
amino acid residues.

A homologue, analogue or derivative of a cellulose gene product as referred to herein may
readily be made using peptide synthetic techniques well-known in the art, such as solid phase
peptide synthesis and the like, or by recombinant DNA manipulations. Techniques for
making substituent mutations at pre-determined sites using recombinant DNA technology, for
example by M13 mutagenesis, are also well-known. The manipulation of nucleic acid
molecules to produce variant peptides, polypeptides or proteins which manifest as
substitutions, insertions or deletions is well-known in the art.

The cellulose gene products described herein may be derivatized further by the inclusion or
attachment thereto of a protective group which prevents, inhibits or slows proteolytic or
cellular degradative processes. Such derivatization may be useful where the half-life of the
subject polypeptide is required to be extended, for example to increase the amount of cellulose
produced in a primary or secondary cell wall of a plant cell or alternatively, to increase the
amount of protein produced in a bacterial or eukaryotic expression system. Examples of
chemical groups suitable for this purpose include, but are not limited to, any of the
non-conventional amino acid residues listed in Table 2, in particular a D-stereoisomer or a
methylated form of a naturally-occurring amino acid listed in Table 1. Additional chemical
groups which are useful for this purpose are selected from the list comprising aryl or
heterocyclic N-acyl substituents, polyalkylene oxide moieties, desulphatohirudin muteins,
alpha-muteins, alpha-aminophosphonic acids, water-soluble polymer groups such as
polyethylene glycol attached to sugar residues using hydrazone or oxime groups,
benzodiazepine dione derivatives, glycosyl groups such as beta-glycosylamine or a derivative
thereof, isocyanate conjugated to a polyol functional group or polyoxyethylene polyol capped
with diisocyanate, amongst others. Similarly, a cellulose gene product or a homologue,
analogue or derivative thereof may be cross-linked or fused to itself or to a protease inhibitor
peptide, to reduce susceptibility of said molecule to proteolysis.

TABLE 2

Non-conventional                Code      Non-conventional                Code
amino acid                                amino acid

α-aminobutyric acid             Abu       L-N-methylalanine               Nmala
α-amino-α-methylbutyrate        Mgabu     L-N-methylarginine              Nmarg
aminocyclopropane-carboxylate   Cpro      L-N-methylasparagine            Nmasn
                                          L-N-methylaspartic acid         Nmasp
aminoisobutyric acid            Aib       L-N-methylcysteine              Nmcys
aminonorbornyl-carboxylate      Norb      L-N-methylglutamine             Nmgln
                                          L-N-methylglutamic acid         Nmglu
cyclohexylalanine               Chexa     L-N-methylhistidine             Nmhis
cyclopentylalanine              Cpen      L-N-methylisoleucine            Nmile
D-alanine                       Dal       L-N-methylleucine               Nmleu
D-arginine                      Darg      L-N-methyllysine                Nmlys
D-aspartic acid                 Dasp      L-N-methylmethionine            Nmmet
D-cysteine                      Dcys      L-N-methylnorleucine            Nmnle
D-glutamine                     Dgln      L-N-methylnorvaline             Nmnva
D-glutamic acid                 Dglu      L-N-methylornithine             Nmorn
D-histidine                     Dhis      L-N-methylphenylalanine         Nmphe
D-isoleucine                    Dile      L-N-methylproline               Nmpro
D-leucine                       Dleu      L-N-methylserine                Nmser
D-lysine                        Dlys      L-N-methylthreonine             Nmthr
D-methionine                    Dmet      L-N-methyltryptophan            Nmtrp
D-ornithine                     Dorn      L-N-methyltyrosine              Nmtyr
D-phenylalanine                 Dphe      L-N-methylvaline                Nmval
D-proline                       Dpro      L-N-methylethylglycine          Nmetg
D-serine                        Dser      L-N-methyl-t-butylglycine       Nmtbug
D-threonine                     Dthr      L-norleucine                    Nle
D-tryptophan                    Dtrp      L-norvaline                     Nva
D-tyrosine                      Dtyr      α-methyl-aminoisobutyrate       Maib
D-valine                        Dval      α-methyl-γ-aminobutyrate        Mgabu
D-α-methylalanine               Dmala     α-methylcyclohexylalanine       Mchexa
D-α-methylarginine              Dmarg     α-methylcyclopentylalanine      Mcpen
D-α-methylasparagine            Dmasn     α-methyl-α-naphthylalanine      Manap
D-α-methylaspartate             Dmasp     α-methylpenicillamine           Mpen
D-α-methylcysteine              Dmcys     N-(4-aminobutyl)glycine         Nglu
D-α-methylglutamine             Dmgln     N-(2-aminoethyl)glycine         Naeg
D-α-methylhistidine             Dmhis     N-(3-aminopropyl)glycine        Norn
D-α-methylisoleucine            Dmile     N-amino-α-methylbutyrate        Nmaabu
D-α-methylleucine               Dmleu     α-naphthylalanine               Anap
D-α-methyllysine                Dmlys     N-benzylglycine                 Nphe
D-α-methylmethionine            Dmmet     N-(2-carbamylethyl)glycine      Ngln
D-α-methylornithine             Dmorn     N-(carbamylmethyl)glycine       Nasn
D-α-methylphenylalanine         Dmphe     N-(2-carboxyethyl)glycine       Nglu
D-α-methylproline               Dmpro     N-(carboxymethyl)glycine        Nasp
D-α-methylserine                Dmser     N-cyclobutylglycine             Ncbut
D-α-methylthreonine             Dmthr     N-cycloheptylglycine            Nchep
D-α-methyltryptophan            Dmtrp     N-cyclohexylglycine             Nchex
D-α-methyltyrosine              Dmty      N-cyclodecylglycine             Ncdec
D-α-methylvaline                Dmval     N-cyclododecylglycine           Ncdod
D-N-methylalanine               Dnmala    N-cyclooctylglycine             Ncoct
D-N-methylarginine              Dnmarg    N-cyclopropylglycine            Ncpro
D-N-methylasparagine            Dnmasn    N-cycloundecylglycine           Ncund
D-N-methylaspartate             Dnmasp    N-(2,2-diphenylethyl)glycine    Nbhm
D-N-methylcysteine              Dnmcys    N-(3,3-diphenylpropyl)glycine   Nbhe
D-N-methylglutamine             Dnmgln    N-(3-guanidinopropyl)glycine    Narg
D-N-methylglutamate             Dnmglu    N-(1-hydroxyethyl)glycine       Nthr
D-N-methylhistidine             Dnmhis    N-(hydroxyethyl)glycine         Nser
D-N-methylisoleucine            Dnmile    N-(imidazolylethyl)glycine      Nhis
D-N-methylleucine               Dnmleu    N-(3-indolylethyl)glycine       Nhtrp
D-N-methyllysine                Dnmlys    N-methyl-γ-aminobutyrate        Nmgabu
N-methylcyclohexylalanine       Nmchexa   D-N-methylmethionine            Dnmmet
D-N-methylornithine             Dnmorn    N-methylcyclopentylalanine      Nmcpen
N-methylglycine                 Nala      D-N-methylphenylalanine         Dnmphe
N-methylaminoisobutyrate        Nmaib     D-N-methylproline               Dnmpro
N-(1-methylpropyl)glycine       Nile      D-N-methylserine                Dnmser
N-(2-methylpropyl)glycine       Nleu      D-N-methylthreonine             Dnmthr
D-N-methyltryptophan            Dnmtrp    N-(1-methylethyl)glycine        Nval
D-N-methyltyrosine              Dnmtyr    N-methyl-α-naphthylalanine      Nmanap
D-N-methylvaline                Dnmval    N-methylpenicillamine           Nmpen
γ-aminobutyric acid             Gabu      N-(p-hydroxyphenyl)glycine      Nhtyr
L-t-butylglycine                Tbug      N-(thiomethyl)glycine           Ncys
L-ethylglycine                  Etg       penicillamine                   Pen
L-homophenylalanine             Hphe      L-α-methylalanine               Mala
L-α-methylarginine              Marg      L-α-methylasparagine            Masn
L-α-methylaspartate             Masp      L-α-methyl-t-butylglycine       Mtbug
L-α-methylcysteine              Mcys      L-methylethylglycine            Metg
L-α-methylglutamine             Mgln      L-α-methylglutamate             Mglu
L-α-methylhistidine             Mhis      L-α-methylhomophenylalanine     Mhphe
L-α-methylisoleucine            Mile      N-(2-methylthioethyl)glycine    Nmet
L-α-methylleucine               Mleu      L-α-methyllysine                Mlys
L-α-methylmethionine            Mmet      L-α-methylnorleucine            Mnle
L-α-methylnorvaline             Mnva      L-α-methylornithine             Morn
L-α-methylphenylalanine         Mphe      L-α-methylproline               Mpro
L-α-methylserine                Mser      L-α-methylthreonine             Mthr
L-α-methyltryptophan            Mtrp      L-α-methyltyrosine              Mtyr
L-α-methylvaline                Mval      L-N-methylhomophenylalanine     Nmhphe

N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine      Nnbhm
N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine     Nnbhe
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane    Nmbc


In an alternative embodiment of the invention, the recombinant cellulose gene product is
characterised by at least one functional β-glycosyl transferase domain contained therein.

The term "β-glycosyl transferase domain" as used herein refers to a sequence of amino acids
which is highly conserved in different processive enzymes belonging to the class of glycosyl
transferase enzymes (Saxena et al., 1995), for example the bacterial β-1,4-glycosyl
transferase enzymes and plant cellulose synthase enzymes amongst others, wherein said
domain possesses a putative function in contributing to or maintaining the overall catalytic
activity, substrate specificity or substrate binding of an enzyme in said enzyme class. The
β-glycosyl transferase domain is recognisable by the occurrence of certain amino acid
residues at particular locations in a polypeptide sequence; however, there is no stretch of
contiguous amino acid residues comprised therein.
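Because the domain is defined by dispersed residues rather than one contiguous stretch, it is more naturally searched for as an ordered series of short elements. The Python sketch below is a minimal illustration under stated assumptions: the helper `find_ordered_elements`, the choice of three aspartate residues followed by a Q..RW-type signature, and the toy sequence in the usage note are all hypothetical and are not taken from the sequences of the invention; the specification itself refers only to conserved aspartate residues and a QVLRW signature (see the description of Figure 10).

```python
import re
from typing import List, Optional


def find_ordered_elements(sequence: str, patterns: List[str]) -> Optional[List[int]]:
    """Locate short elements that must occur in order but need not be adjacent.

    Each pattern is a regular expression for one element (a single residue or
    a short signature). Returns the 0-based start position of each element,
    or None if the elements cannot all be found in the given order.
    """
    positions = []
    search_from = 0
    for pattern in patterns:
        match = re.compile(pattern).search(sequence, search_from)
        if match is None:
            return None
        positions.append(match.start())
        search_from = match.end()  # the next element must lie downstream
    return positions


# Assumed, illustrative pattern: three Asp residues, then Q-x-x-R-W.
DISPERSED_DOMAIN = ["D", "D", "D", "Q..RW"]
```

With the toy sequence `"MADKKDLLDPPQVLRWAA"`, `find_ordered_elements(seq, DISPERSED_DOMAIN)` reports one start position per element, while a sequence lacking one of the elements returns `None`. Any amount of sequence may separate the elements, which is why no single contiguous stretch can stand in for the domain as a whole.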

As a consequence of the lack of contiguity in a β-glycosyl transferase domain, it is not a
straightforward matter to isolate a cellulose gene by taking advantage of the presence of a
β-glycosyl transferase domain in the polypeptide encoded by said gene. For example, the
β-glycosyl transferase domain would not be easily utilisable as a probe to facilitate the rapid
isolation of all β-glycosyl transferase genetic sequences from a particular organism and then
to isolate from those genetic sequences a cellulose gene such as cellulose synthase.

In a preferred embodiment, the present invention provides an isolated polypeptide which:
(i) contains at least one structural β-glycosyl transferase domain as hereinbefore
defined; and
(ii) has at least 40% amino acid sequence similarity to at least 20 contiguous
amino acid residues set forth in any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or
14, or a homologue, analogue or derivative thereof.

More preferably, the polypeptide of the invention is at least 40% identical to at least 50
contiguous amino acid residues, even more preferably at least 100 amino acid residues of
any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or a homologue, analogue or
derivative thereof.



In a particularly preferred embodiment, the percentage similarity to any one or more of SEQ
ID Nos:2, 6, 8, 10, 12 or 14 is at least 50-60%, more preferably at least 65-70%, even
more preferably at least 75-80% and even more preferably at least 85-90%, including about
91% or 95%.
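The percentage thresholds above are stated over windows of contiguous residues (at least 20, 50 or 100). As a rough illustration of how such a criterion might be checked, the Python sketch below computes percent identity between ungapped windows. It is a simplification made for this example: the function names are hypothetical, gaps and similarity scoring matrices are ignored, and only exact residue identity is counted, so it should not be read as the comparison method actually used for the sequences of the invention.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length, ungapped sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_window_identity(candidate: str, reference: str, window: int) -> float:
    """Best percent identity between any `window` contiguous residues of
    `reference` and any equally long stretch of `candidate` (no gaps)."""
    if window <= 0 or window > min(len(candidate), len(reference)):
        raise ValueError("window must be positive and fit inside both sequences")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_window = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            best = max(best, percent_identity(ref_window, candidate[j:j + window]))
    return best


def meets_threshold(candidate: str, reference: str,
                    window: int = 20, threshold: float = 40.0) -> bool:
    """True if some ungapped window reaches the stated percentage."""
    return best_window_identity(candidate, reference, window) >= threshold
```

The defaults (`window=20`, `threshold=40.0`) mirror the broadest criterion in the text; passing `window=50` or `window=100`, or a higher threshold, corresponds to the more preferred embodiments. The exhaustive window comparison is quadratic in sequence length and is intended for illustration, not for database-scale searching.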

In a related embodiment, the present invention provides a "sequencably pure" form of the
amino acid sequence described herein. "Sequencably pure" is hereinbefore described as
substantially homogeneous to facilitate amino acid determination.

In a further related embodiment, the present invention provides a "substantially
homogeneous" form of the subject amino acid sequence, wherein the term "substantially
homogeneous" is hereinbefore defined as being in a form suitable for interaction with an
immunologically interactive molecule. Preferably, the polypeptide is at least 20%
homogeneous, more preferably at least 50% homogeneous, still more preferably at least
75% homogeneous and yet still more preferably at least about 95-100% homogeneous, in
terms of activity per microgram of total protein in the protein preparation.

The present invention further extends to a synthetic peptide of at least 5 amino acid residues
in length derived from or comprising a part of the amino acid sequence set forth in any one
or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or having at least 40% similarity thereto.

Those skilled in the art will be aware that such synthetic peptides may be useful in the
production of immunologically interactive molecules for the preparation of antibodies or as
the peptide component of an immunoassay.

The invention further extends to an antibody molecule such as a polyclonal or monoclonal
antibody or an immunologically interactive part or fragment thereof which is capable of
binding to a cellulose gene product according to any of the foregoing embodiments.

The term "antibody" as used herein is intended to include fragments thereof which are also
specifically reactive with a polypeptide of the invention. Antibodies can be fragmented using
conventional techniques and the fragments screened for utility in the same manner as for whole
antibodies. For example, F(ab')2 fragments can be generated by treating antibody with pepsin.
The resulting F(ab')2 fragment can be treated to reduce disulfide bridges to produce Fab'
fragments.

Those skilled in the art will be aware of how to produce antibody molecules when provided
with the cellulose gene product of the present invention. For example, by using a polypeptide
of the present invention, polyclonal antisera or monoclonal antibodies can be made using
standard methods. A mammal (e.g. a mouse, hamster, or rabbit) can be immunized with an
immunogenic form of the polypeptide which elicits an antibody response in the mammal.
Techniques for conferring immunogenicity on a polypeptide include conjugation to carriers
or other techniques well known in the art. For example, the polypeptide can be administered
in the presence of adjuvant. The progress of immunization can be monitored by detection of
antibody titers in plasma or serum. Standard ELISA or other immunoassay can be used with
the immunogen as antigen to assess the levels of antibodies. Following immunization, antisera
can be obtained and, if desired, IgG molecules corresponding to the polyclonal antibodies may
be isolated from the sera.

To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested
from an immunized animal and fused with myeloma cells by standard somatic cell fusion
procedures, thus immortalizing these cells and yielding hybridoma cells. Such techniques are
well known in the art. Examples include the hybridoma technique originally developed by Kohler
and Milstein (1975), as well as other techniques such as the human B-cell hybridoma technique
(Kozbor et al., 1983), the EBV-hybridoma technique to produce human monoclonal antibodies
(Cole et al., 1985), and screening of combinatorial antibody libraries (Huse et al., 1989).
Hybridoma cells can be screened immunochemically for production of antibodies which are
specifically reactive with the polypeptide and monoclonal antibodies isolated.

As with all immunogenic compositions for eliciting antibodies, the immunogenically effective
amounts of the polypeptides of the invention must be determined empirically. Factors to be
considered include the immunogenicity of the native polypeptide, whether or not the
polypeptide will be complexed with or covalently attached to an adjuvant or carrier protein or
other carrier and route of administration for the composition, i.e. intravenous, intramuscular,
subcutaneous, etc., and the number of immunizing doses to be administered. Such factors are
known in the vaccine art and it is well within the skill of immunologists to make such
determinations without undue experimentation.

It is within the scope of this invention to include any second antibodies (monoclonal,
polyclonal or fragments of antibodies) directed to the first mentioned antibodies discussed
above. Both the first and second antibodies may be used in detection assays or a first antibody
may be used with a commercially available anti-immunoglobulin antibody.

The present invention is further described by reference to the following non-limiting Figures
and Examples.

In the Figures:

Figure 1 is a photographic representation showing the inflorescence length of wild-type
Arabidopsis thaliana Columbia plants (plants 1 and 3) and rsw1 plants (plants 2 and 4)
grown at 21°C (plants 1 and 2) or 31°C (plants 3 and 4). Plants were grown initially at 21°C
until bolting commenced, the bolts were removed and the re-growth followed in plants grown
at each temperature.

Figure 2 is a photographic representation of a cryo-scanning electron micrograph showing
misshapen epidermal cells in the cotyledons and hypocotyl of the rsw1 mutant when grown
at 31°C for 10 days.

Figure 3 is a graphical representation of a gas chromatograph of alditol acetates of
methylated sugars from a cellulose standard (top panel) and from the neutral glucan derived
from shoots of rsw1 plants grown at 31°C (lower panel). The co-incident peaks show that
the rsw1 glucan is 1,4-linked.

Figure 4 is a schematic representation of the contiguous region of Arabidopsis thaliana
chromosome 4 (stippled box) between the cosmid markers g8300 and 06455, showing the
location of overlapping YAC clones (open boxes) within the contiguous region. The position
of the RSW1 locus is also indicated, approximately 1.2 cM from g8300 and 0.9 cM from
06455. The scale indicates 100 kb in length. L, left-end of YAC; R, right-end of YAC.
Above the representation of chromosome 4, the YAC fragments and cosmid clone fragments
used to construct the contiguous region are indicated, using a prefix designation
corresponding to the YAC or cosmid from which the fragments were obtained (e.g. yUP9E3,
yUP20B12, etc.) and a suffix designation indicating whether the fragment corresponds to the
right-end (RE) or left-end (LE) of the YAC clone; N, North; S, South; CAPS, cleaved
amplified polymorphic sequence (Konieczny and Ausubel, 1993) version of the g8300
marker.

Figure 5 is a schematic representation of a restriction map of construct 23H12 between the
left T-DNA border (LB) and right T-DNA border (RB) sequences (top solid line), showing
the position of the Arabidopsis thaliana RSW1 locus (stippled box). The line at the top of
the figure indicates the region of 23H12 which is contained in construct pRSW1. The
structure of the RSW1 gene between the translation start (ATG) and translation stop (TAG)
codons is indicated at the bottom of the figure. Exons are indicated by filled boxes; introns
are indicated by the solid black line. The alignment of EST clone T20782 to the 3'-end of
the RSW1 gene, from near the end of exon 7 to the end of exon 14, is also indicated at the
bottom of the figure. Restriction sites within 23H12 are as follows: B, BamHI; E, EcoRI;
H, HindIII; S, SalI; Sm, SmaI.

Figure 6 is a photographic representation showing complementation of the radial root
swelling phenotype of the rsw1 mutant by transformation with construct 23H12. The rsw1
mutant was transformed with 23H12 as described in Example 6. Transformed rsw1 plants
(centre group of three seedlings), untransformed rsw1 plants (left group of three seedlings)
and untransformed A. thaliana Columbia plants (right group of three seedlings) were grown
at 21°C for 5 days and then transferred to 31°C for a further 2 days, after which time the
degree of root elongation and radial root swelling was determined.

Figure 7 is a photographic representation comparing wild-type Arabidopsis thaliana
Columbia plants (right-hand side of the ruler) and A. thaliana Columbia plants transformed
with the antisense RSW1 construct (i.e. EST T20782 expressed in the antisense orientation
under control of the CaMV 35S promoter sequence; left-hand side of the ruler), showing
inflorescence shortening at 21°C in plants transformed with the antisense RSW1 construct
compared to untransformed Columbia plants. The phenotype of the antisense plants at 21°C
is similar to the phenotype of the rsw1 mutant at 31°C. Inflorescence height is indicated in
millimetres.

Figure 8 is a schematic representation showing the first 90 amino acid residues of
Arabidopsis thaliana RSW1 aligned to the amino acid sequences of homologous polypeptides
from A. thaliana and other plant species. The shaded region indicates highly conserved
sequences. Ath-A and Ath-B are closely related Arabidopsis thaliana cDNA clones identified
by hybridisation screening using part of the RSW1 cDNA as a probe. S0542, rice EST clone
(MAFF DNA bank, Japan); celA1 and celA1, cotton cDNA sequences expressed in cotton
fibre (Pear et al., 1996); SOYSTF1A and SOYSTF1B, putative soybean bZIP transcription
factors. Amino acid designations are as indicated in Table 1 incorporated herein. Conserved
cysteine residues are indicated by the asterisk.

Figure 9 is a schematic representation showing the alignment of the complete amino acid
sequence of Arabidopsis thaliana RSW1 to the amino acid sequences of homologous
polypeptides from A. thaliana and other plant species. The shaded region indicates highly
conserved sequences. Ath-A and Ath-B are closely related Arabidopsis thaliana cDNA
clones identified by hybridisation screening using part of the RSW1 cDNA as a probe.
S0542, rice EST clone (MAFF DNA bank, Japan); celA1, cotton genetic sequence (Pear et
al., 1996); D48636, a partial cDNA clone obtained from rice (Pear et al., 1996). Amino acid
designations are as indicated in Table 1 incorporated herein. Numbering indicates the amino
acid position in the RSW1 sequence.

Figure 10 is a schematic representation of the RSW1 polypeptide, showing the positions of
putative transmembrane helices (hatched boxes), cysteine-rich region (Cys) and aspartate
residues (D) and the QVLRW signature which are conserved between RSW1 and related
amino acid sequences. Regions of RSW1 which are highly-conserved between putative
cellulose biosynthesis polypeptides are indicated by the dark-shaded boxes, while
less-conserved regions are indicated by the light-shaded boxes.

Figure 11 is a photographic representation of a Southern blot hybridisation of the 5'-end
of the Arabidopsis thaliana RSW1 cDNA to BglII-digested DNA derived from A. thaliana
(lane 1) and cotton (lane 2). Hybridisations were carried out under low stringency conditions
at 55°C. Arrows indicate the positions of hybridising bands.

EXAMPLE 1

CHARACTERISATION OF THE CELLULOSE-DEFICIENT
Arabidopsis thaliana MUTANT rsw1

1. Morphology

The Arabidopsis thaliana rsw1 mutant was produced in a genetic background comprising the
ecotype Columbia.

The altered root cell shape and temperature sensitivity of the root morphology of the
Arabidopsis thaliana mutant rsw1 are disclosed, among other morphological mutants, by
Baskin et al. (1992).

As shown in Figure 1, the present inventors have shown that the rsw1 mutant exhibits the
surprising phenotype of having reduced inflorescence height when grown at 31°C, compared
to wild-type Columbia plants grown under similar conditions. In contrast, when grown at
21°C, the inflorescence height of rsw1 is not significantly different from wild type plants
grown under similar conditions, indicating that the shoot phenotype of rsw1 is conditional
and temperature-dependent.

Furthermore, cryo-scanning electron microscopy of the epidermal cells of the rsw1 mutant
indicates significant abnormality in cell shape, particularly in respect of those epidermal cells
forming the leaves, hypocotyl and cotyledons, when the seedlings are grown at 31°C (Figure
2).

Rosettes (terminal complexes) are the putative hexameric cellulose synthase complexes of
higher plant plasma membranes (Herth, 1985). Freeze-fractured root cells of Arabidopsis
thaliana rsw1 plants grown at 18°C show cellulose microfibrils and rosettes on the PF face
of the plasma membrane that resemble those of wild-type A. thaliana and other
angiosperms. Transferring the rsw1 mutant to 31°C reduces the number of rosettes in the
mutant within 30 min, leading to extensive loss after 3 hours. Plasma membrane particles
align in rows on prolonged exposure to the restrictive temperature. In contrast, there is no
change in the appearance of cortical microtubules that align cellulose microfibrils, or of
Golgi bodies that synthesise other wall polysaccharides and assemble rosettes.

2. Carbohydrate content

The effect of mutations in the RSW1 gene on the synthesis of cellulose and other
carbohydrates was assessed by measuring in vivo incorporation of ¹⁴C (supplied as uniformly
labelled glucose) into various cell wall fractions. Wild type (RSW1) and homozygous mutant
rsw1 seed were germinated at 21°C on agar containing Hoagland's nutrients and 1% (w/v)
unlabelled glucose. After 5 d, half of the seedlings were transferred to 31°C for 1 d while
the remainder were maintained at 21°C for the same time. Seedlings were covered with a
solution containing Hoagland's nutrients and ¹⁴C-glucose and incubated for a further 3 h at
the same temperature. Rinsed roots and shoots were separated and frozen in liquid nitrogen.
Tissue was homogenised in cold 0.5 M potassium phosphate buffer (0.5 M KH₂PO₄, pH 7.0)
and a crude cell wall fraction collected by centrifugation at 2800 rpm. The wall fraction was
extracted with chloroform/methanol [1:1 (v/v)] at 40°C for 1 hour, followed by a brief
incubation at 150°C, to remove lipids. The pellet was washed successively with 2 ml
methanol, 2 ml acetone and twice with 2 ml of deionised water. Finally, the pellet was
extracted successively with dimethyl sulphoxide under nitrogen to remove starch; 0.5%
ammonium oxalate to remove pectins; 0.1 M KOH and 3 mg/ml NaBH₄ and then with 4 M
KOH and 3 mg/ml NaBH₄ to extract hemicelluloses; boiling acetic acid/nitric acid/water
[8:1:2 (v/v)], to extract any residual non-cellulosic carbohydrates and leave crystalline
cellulose as the final insoluble pellet (Updegraph, 1969). All fractions were analysed by
liquid scintillation counting and the counts in each fraction from the mutant were expressed
as a percentage of the counts in the wild type under the same conditions.
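The normalisation described in the final sentence above is a simple ratio. The Python sketch below shows the arithmetic; the function name `percent_of_wild_type` and the counts used in the usage note are hypothetical placeholders chosen for illustration and are not measurements from this Example.

```python
def percent_of_wild_type(mutant_counts: float, wild_type_counts: float) -> float:
    """Express scintillation counts in a mutant fraction as a percentage of the
    counts in the comparable wild-type fraction (same fraction, same temperature)."""
    if wild_type_counts <= 0:
        raise ValueError("wild-type counts must be positive")
    return 100.0 * mutant_counts / wild_type_counts


def normalise_fractions(mutant: dict, wild_type: dict) -> dict:
    """Apply the normalisation to every fraction present in both data sets."""
    return {
        fraction: round(percent_of_wild_type(mutant[fraction], wild_type[fraction]), 1)
        for fraction in mutant
        if fraction in wild_type
    }
```

For example, with placeholder counts of 360 in a mutant fraction and 1000 in the matching wild-type fraction, `percent_of_wild_type(360, 1000)` gives 36.0. Values near 100 indicate that the fraction is labelled to a similar extent in mutant and wild type; values well below 100 indicate reduced incorporation in the mutant.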

As shown in Table 3, mutant and wild type plants behave in quite similar fashion at 21°C
(the permissive temperature) whereas, at the restrictive temperature of 31°C, the
incorporation of ¹⁴C into cellulose is severely inhibited (to 36% of wild type) by the rsw1
mutation. The data in Table 3 indicate that cellulose synthesis is specifically inhibited in
the rsw1 mutant. The wild type RSW1 gene is therefore involved quite directly in cellulose
synthesis and changing its sequence by mutation changes the rate of synthesis.



TABLE 3

Counts in fractions from rsw1 plants expressed as a
% of counts in comparable fraction from wild type plants

     Pectins          Hemicelluloses        Cellulose
  21°C    31°C        21°C    31°C        21°C    31°C

  125     104         111     101         80      36

In homozygous mutant rsw1 plants, the pectin fraction extracted by ammonium oxalate
contained abundant glucose, atypical of true uronic acid-rich pectins. The great majority of
the glucose remained in the supernatant when cetyltrimethylammonium bromide precipitated
the negatively charged pectins.

3. Non-crystalline β-1,4-glucan content

The quantity of cellulose and the quantity of a non-crystalline β-1,4-glucan recovered from the
ammonium oxalate fraction were determined for seedlings of wild type Columbia and for
backcrossed, homozygous rsw1 that were grown for either 7 days at 21°C or alternatively, for
2 days at 21°C and 5 days at 31°C, on vertical agar plates containing growth medium (Baskin
et al., 1992) plus 1% (w/v) glucose, and under continuous light (90 µmol m⁻² s⁻¹). Roots and
shoots were separated from about 150 seedlings, freeze-dried to constant weight and ground
in a mortar and pestle with 3 ml of cold 0.5 M potassium phosphate buffer (pH 7.0). The
combined homogenate after two buffer rinses (2 ml each) was centrifuged at 2800 x g for 10
min. After washing the pellet fraction twice with 2 ml buffer and twice with 2 ml distilled
water, the pellet, comprising the crude cell wall fraction, and the pooled supernatants,
comprising the phosphate buffer fraction, were retained. The crude cell wall pellet fraction was
stirred with two 3 ml aliquots of chloroform/methanol [1:1 (v/v)] for 1 hour at 40°C, 2 ml of
methanol at 40°C for 30 min, 2 ml of acetone for 30 min, and twice with water. The whole
procedure was repeated in the case of shoots. Combined supernatants were dried in a nitrogen
stream. The pellet was successively extracted with: (i) 3 ml of DMSO-water [9:1 (v/v)], sealed
under nitrogen, overnight with shaking, followed by two 2 ml extractions using DMSO/water
and three 2 ml water washes; (ii) 3 ml of ammonium oxalate (0.5%) at 100°C for 1 hour,
followed by two water washes; (iii) 3 ml of 0.1 M KOH containing 1 mg/ml sodium
borohydride, for 1 hour at 25°C (repeated once for root material or twice for shoot material),
with a final wash with 2 ml water; (iv) 3 ml of 4 M KOH containing 1 mg/ml sodium
borohydride, for 1 hour at 25°C (repeated once for root material or twice for shoot material).
The final pellet was boiled with intermittent stirring in 3 ml of acetic acid-nitric acid-water
[8:1:2 (v/v)] (Updegraph, 1969), combined with 2 water washes, and diluted with 5 ml water.

The insoluble residue of cellulose was solubilised in 67% (v/v) H2SO4, shown to contain
greater than 97% (w/v) glucose using GC/MS (Fisons AS800/MD800) of alditol acetates
(Doares et al., 1991) and quantified in three independent samples by anthrone/H2SO4 reaction.
Results of GC/MS for pooled replica samples are presented in Table 4.

The non-crystalline β-1,4-glucan was recovered as the supernatant from the ammonium oxalate
fraction when anionic pectins were precipitated by overnight incubation at 37 °C with 2% (w/v)
cetyltrimethylammonium bromide (CTAB) and collected by centrifugation at 2800 x g for 10
min. The glucan (250 µg/ml) or starch (Sigma; 200 µg/ml) were digested with mixtures of
endocellulase (EC 3.2.1.4; Megazyme, Australia) from Trichoderma and almond β-glucosidase
(EC 3.2.1.21; Sigma), or Bacillus sp. α-amylase (EC 3.2.1.1; Sigma) and rice α-glucosidase
(EC 3.2.1.20; Sigma).

The material recovered in the supernatant from the ammonium oxalate fraction was shown to
contain a pure β-1,4-glucan by demonstrating that: (i) only glucose was detectable when it
was hydrolysed by 2 M TFA in a sealed tube for 1 h at 120 °C in an autoclave, the supernatant
(2000 g for 5 min) was dried under vacuum at 45 °C to remove TFA and glucose was
determined by GC/MS; (ii) methylation (Needs and Selvendran 1993) gave a dominant peak
resolved by thin layer chromatography and by GC/MS that was identical to that from a
cellulose standard and so indicative of 1,4-linked glucan (Figure 3); and (iii) the
endo-cellulase and β-1,4-glucosidase mixture released 83% of the TFA-releasable glucose
from the glucan produced by rsw1 at 31 °C while the α-amylase/α-glucosidase mixture
released no glucose from the glucan. Conversely, the α-amylase/α-glucosidase mixture
released 95% of the TFA-releasable glucose from a starch sample, while the
endo-cellulase/β-1,4-glucosidase mixture released no glucose from starch.

Extractability of the glucan using ammonium oxalate, and the susceptibility of the glucan to
endocellulase/β-glucosidase and TFA hydrolysis indicate that the glucan in the rsw1 mutant
is not crystalline, because it is the crystallinity of glucan which makes cellulose resistant to
extraction and degradation.

Table 4 shows the quantity of glucose in cellulose determined by the anthrone/H2SO4 reaction
and the quantity in the non-crystalline glucan after TFA hydrolysis, for shoots of wild type and
mutant rsw1 Arabidopsis plants. The data indicate that the production of cellulose and of the
non-crystalline β-1,4-glucan can be manipulated by mutational changes in the RSW1 gene.

TABLE 4

Glucose contents of cellulose and of the ammonium oxalate-extractable glucan

                 wild type                 rsw1
                 21 °C       31 °C         21 °C       31 °C
    Cellulose    273 ± 28    363 ± 18*     218 ± 20    159 ± 19*
    Glucan       22          58            24          195

All values nmol glucose mg⁻¹ plant dry weight ± sd (n=3).
* Differences significant at 0.001% level.

4. Starch content

The quantity of starch recovered in the DMSO fraction from roots in the experiment described
above was also determined by the anthrone/H2SO4 extraction (Table 5).
As shown in Table 5, the level of starch deposited in the rsw1 mutant is 4-fold that
detectable in the roots of wild-type plants at the restrictive temperature of 31 °C. A similar
rise in starch is also seen if the data are expressed as nmol glucose per plant. There is no
detectable difference in deposition of starch between rsw1 plants and wild-type plants at
21 °C.

TABLE 5

Quantity of starch (nmol glucose per mg dry weight of seedling) extracted
from roots of rsw1 and wild type seedlings

    Temperature    Phenotype
                   Wild-type    rsw1 mutant
    21 °C          22           18
    31 °C          37           126



The composition of cell walls in the rsw1 mutant plant compared to wild type plants at the
restrictive temperature of 31 °C is summarised in Table 6.



TABLE 6

Mol% composition of cell walls from shoots of rsw1 and wild-type
seedlings grown at 31 °C

    Cell wall component             Phenotype
                                    Wild-type    rsw1 mutant
    Crystalline cellulose           38.4         16.5
    Non-crystalline β-1,4-glucan    8.5          27.1
    Pectin                          37.1         36.3
    Alkali-soluble                  15.6         19.8
    Acid-soluble                    0.3          0.4



In conclusion, the rsw1 mutation disassembles cellulose synthase complexes in the plasma
membrane, reduces cellulose accumulation and causes β-1,4-glucan to accumulate in a
non-crystalline form.
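The size of the shifts summarised above can be read directly from Tables 4 and 5. The following Python sketch is illustrative only: it divides the rsw1 mean by the wild-type mean at each temperature, using the values as transcribed from those tables. The dictionary layout and the function names are hypothetical and do not form part of the specification.

```python
# Mean values (nmol glucose per mg dry weight) transcribed from Tables 4 and 5.
# Table 4 reports shoots; Table 5 reports roots.
TABLES = {
    "cellulose (Table 4)": {
        "wild type": {"21C": 273, "31C": 363},
        "rsw1": {"21C": 218, "31C": 159},
    },
    "glucan (Table 4)": {
        "wild type": {"21C": 22, "31C": 58},
        "rsw1": {"21C": 24, "31C": 195},
    },
    "starch (Table 5)": {
        "wild type": {"21C": 22, "31C": 37},
        "rsw1": {"21C": 18, "31C": 126},
    },
}


def mutant_to_wild_type(table, temperature):
    """Return the rsw1 mean divided by the wild-type mean."""
    return table["rsw1"][temperature] / table["wild type"][temperature]


def summarise():
    """Map (fraction, temperature) to the rsw1/wild-type ratio, 2 decimals."""
    return {
        (name, temperature): round(mutant_to_wild_type(table, temperature), 2)
        for name, table in TABLES.items()
        for temperature in ("21C", "31C")
    }


if __name__ == "__main__":
    for (name, temperature), ratio in summarise().items():
        print(f"{name:22s} {temperature}: rsw1 / wild type = {ratio}")
```

On these figures the ratios stay close to 1 at 21 °C and diverge at 31 °C, which is the pattern the conclusion describes.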


EXAMPLE 2 

MAPPING OF YAC CLONES TO THE rsw1 LOCUS

The rsw1 locus in the mutant Arabidopsis thaliana plant described in Example 1 above was
mapped to chromosome 4 of A. thaliana using RFLP gene mapping techniques (Chang et al.,
1988; Nam et al., 1989) to analyse the F2 or F3 progeny derived from a Columbia
(Co)/Landsberg (Ler) cross. In particular, the rsw1 mutation was shown to be linked
genetically to the ga5 locus, which is a chromosome 4 visual marker in A. thaliana.

Based on an analysis of map distances and chromosomal break points in 293 F2 or F3
progeny derived from a Columbia (Co)/Landsberg (Ler) cross, rsw1 was localised to an
approximately 2.1 cM region between the RFLP markers g8300 and 06455, approximately
1.2 cM south of the CAPS (cleaved amplified polymorphic sequence; Konieczny and
Ausubel, 1993) version of the g8300 marker (Figure 4).

The interval between g8300 and 06455 in which rsw1 resides was found to be spanned by
an overlapping set of Yeast Artificial Chromosome (YAC) clones. The clones were obtained
from Plant Industry, Commonwealth Scientific and Industrial Research Organisation,
Canberra, Australia. The YACs were positioned in the g8300/06455 interval by
hybridisation using known DNA molecular markers (from within the interval) and DNA
fragments from the ends of the YACs. The length of the interval was estimated to comprise
900 kb of DNA.

Refined gene mapping of recombinants within the region spanned by YAC clones established
the genetic distance between the RFLP marker g8300 and the rsw1 locus.



WO 98/00549 



PCT/AU97/00402 



-50- 

The combination of genetic map distance data and the mapping of YAC clones within the
region further localised the rsw1 locus to the YAC clone designated yUP5C8.

EXAMPLE 3

MAPPING OF cDNA CLONES TO THE YAC CLONE yUP5C8

An Arabidopsis thaliana cDNA clone designated T20782 was obtained from the public
Arabidopsis Resource Centre, Ohio State University, 1735 Neil Avenue, Columbus, OH
43210, United States of America. The T20782 cDNA clone was localised broadly to the
DNA interval on Arabidopsis chromosome 4 between the two markers g8300 and 06455
shown in Figure 4. Using a polymerase chain reaction (PCR) based approach, DNA primers
(5'-AGAACAGCAGATACACGGA-3' and 5'-CTGAAGAAGGCTGGACAAT-3') designed
to the T20782 cDNA nucleotide sequence were used to screen Arabidopsis YAC clone
libraries. The T20782 cDNA clone was found to localise to YACs (CIC1F9, CIC10E9,
CIC11D9) identified on the Arabidopsis chromosome 4 g8300 and 06455 interval (Figure
4). The same approach was used to further localise clone T20782 to YAC clone yUP5C8,
the same YAC designated to contain the rsw1 locus in the same chromosome interval (Figure
4).

Furthermore, amplification of the YAC clone yUP5C8 using primers derived from T20782
produces a 500 bp fragment containing two putative exons identical to part of the T20782
nucleotide sequence, in addition to two intron sequences.

The cDNA T20782 was considered as a candidate gene involved in cellulose biosynthesis.

EXAMPLE 4

NUCLEOTIDE SEQUENCE ANALYSIS OF THE cDNA CLONE T20782

The nucleotide sequence of the cDNA clone T20782 is presented in SEQ ID NO:1. The
nucleotide sequence was obtained using a Dye Terminator Cycle Sequencing kit (Perkin
Elmer cat. #401384) as recommended by the manufacturer. Four template clones were used
for nucleotide sequencing to generate the sequence listed. The first template was the cDNA
clone T20782. This template was sequenced using the following sequencing primers:

a) 5'-CAATGCATTCATAGCTCCAGCCT-3'
b) 5'-AAAAGGCTGGAGCTATGAATGCAT-3'
c) 5'-TCACCGACAGATTCATCATACCCG-3'
d) 5'-GACATGGAATCACCTTAACTGCC-3'
e) 5'-CCATTCAGTCTTGTCTTCGTAACC-3'
f) 5'-GGTTACGAAGACAAGACTGAAATGG-3'
g) 5'-GAACCTCATAGGCATTGTGGGCTGG-3'
h) 5'-GCAGGCTCTATATGGGTATGATCC-3'
i) Standard M13 forward sequencing primer.
j) Standard T7 sequencing primer.
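Several of the primers listed above appear to be paired so that the same region can be read on both strands. As a hedged illustration, the Python sketch below takes primers a) and b) as transcribed here, computes the reverse complement of a), and measures the longest stretch it shares with b). The helper names are hypothetical, and the pairing of a) with b) is an inference from the sequences rather than a statement made in the specification.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_run(a, b):
    """Length of the longest contiguous stretch common to both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            # Extend the run ending at the previous characters, or reset.
            cur.append(prev[j - 1] + 1 if ch_a == ch_b else 0)
        best = max(best, max(cur))
        prev = cur
    return best


primer_a = "CAATGCATTCATAGCTCCAGCCT"   # primer a) as listed
primer_b = "AAAAGGCTGGAGCTATGAATGCAT"  # primer b) as listed

if __name__ == "__main__":
    rc_a = reverse_complement(primer_a)
    print("reverse complement of a):", rc_a)
    print("longest run shared with b):", longest_shared_run(rc_a, primer_b))
```

A long shared run indicates that the two primers anneal to opposite strands at the same site; the same check can be applied to any other pair in the list.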

The second template clone (T20782 SphI deletion clone) was constructed by creating a DNA
deletion within the T20782 clone. The T20782 clone was digested with the restriction
enzyme SphI, the enzyme was heat-killed, the DNA ligated and electroporated into NM522
E. coli host cells. The T20782 SphI deletion clone was then sequenced using a standard M13
forward sequencing primer. Two other deletion clones were made for DNA sequencing in
a similar fashion but the restriction enzymes EcoRI and SmaI were used. The T20782
EcoRI deletion clone and the T20782 SmaI deletion clone were sequenced using a standard
T7 sequencing primer. The DNA sequence shown in SEQ ID NO:1 is for one DNA strand
only; however, those skilled in the art will be able to generate the nucleotide sequence of the
complementary strand from the data provided.

The amino acid sequence encoded by clone T20782 was derived and is set forth in SEQ ID
NO:2.

The T20782 clone encodes all but the first Aspartate (D) residue of the D, D, D, QXXRW
signature conserved in the general architecture of β-glycosyl transferases. In particular,
T20782 encodes 5 amino acid residues of the D, D, D, QXXRW signature, between amino
acid positions 109 and 370 of SEQ ID NO:2. The conserved Aspartate, Aspartate,
Glutamine, Arginine and Tryptophan amino acid residues are shown below, in bold type,
with the local amino acid residues also indicated:

1. Amino acid residues 105 to 113 of SEQ ID NO:2:
LLNVDCDHY;

2. Amino acid residues 324 to 332 of SEQ ID NO:2:
SVTEDILTG; and

3. Amino acid residues 362 to 374 of SEQ ID NO:2:
DRLNQVLRWALGS.

It must be noted that these invariable amino acids merely indicate that the T20782-derived
amino acid sequence belongs to a very broad group of glycosyl transferases. Some of these
enzymes such as cellulose synthase, chitin synthase, alginate synthase and hyaluronic acid
synthase produce functionally very different compounds.

The presence of the conserved amino acid residues merely indicates that the T20782 clone
may encode a β-glycosyl transferase protein such as the cellulose gene product, cellulose
synthase. The fact that the clone localises in the vicinity of a gene involved in cellulose
biosynthesis is the key feature which now focuses interest on the T20782 clone as a candidate
for the RSW1 (cellulose synthase) gene.

The T20782 clone potentially codes for a cellulose synthase.

EXAMPLE 5 

NUCLEOTIDE SEQUENCE ANALYSIS OF THE GENOMIC CLONE 23H12

Clone 23H12 contains approximately 21 kb of Arabidopsis thaliana genomic DNA in the
region between the left border and right border T-DNA sequences, and localises to the
RSW1 candidate YAC yUP5C8. Clone 23H12 was isolated by hybridisation using EST20782
insert DNA, from a genomic DNA library made for plant transformation. Cosmid 12C4 was
also shown to hybridize to the cDNA clone T20782, however this cosmid appears to comprise
a partial genomic sequence corresponding to the related Ath-A cDNA sequence set forth in
SEQ ID NO:7, for which the corresponding amino acid sequence is set forth in SEQ ID
NO:8.

A restriction enzyme map of clone 23H12 is presented in Figure 5.

Nucleotide sequence of 8411 bp of genomic DNA in the binary cosmid clone 23H12 was
obtained (SEQ ID NO:3) by primer walking along the 23H12 template, using a Dye
Terminator Cycle Sequencing kit (Perkin Elmer cat. #401384) as recommended by the
manufacturer. The following primers, at least, were used for DNA sequencing of the 23H12
clone DNA:



a) cs1-R      5'-CAATGCATTCATAGCTCCAGCCT-3'
b) cs1-F      5'-AAAAGGCTGGAGCTATGAATGCAT-3'
c) up         5'-AGAACAGCAGATACACGGA-3'
d) ve76-R2    5'-ATCCGTGTATCTGCTGTTCTTACC-3'
e) est1-R     5'-AATGCTCTTGTTGCCAAAGCAC-3'
f) sve76-F    5'-ATTGTCCAGCCTTCTTCAGG-3'
g) ve76-R     5'-CTGAAGAAGGCTGGACAATGC-3'






h) B12-R1     5'-AGGTAAGCATAGCTGAACCATC-3'
i) B12-R2     5'-AGTAGATTGCAGATGGTTTTCTAC-3'
j) B12-R3     5'-TTCAATGGGTCCACTGTACTAAC-3'
k) B12-R4     5'-ATTCAGATGCACCATTGTC-3'



The structure of the RSW1 gene contained in cosmid clone 23H12 is also presented in Figure
5. As shown therein, coding sequences in 23H12, from the last 12 bp of exon 7 to the end
of exon 14, correspond to the full T20782 cDNA sequence (i.e. SEQ ID NO:1). The
nucleotide sequences of the RSW1 gene comprising exons 1 to 8 were amplified from
A. thaliana Columbia double-stranded cDNA, using amplification primers upstream of the
RSW1 start site and a primer internal to the EST clone T20782.

The exons in the RSW1 gene range from 81 bp to 585 bp in length and all 5' and 3'
intron/exon splice junctions conform to the conserved intron rule.



The RSW1 transcript comprises a 5'-untranslated sequence of at least 70 bp in length, a
3243 bp coding region and a 360 bp 3'-untranslated region. Northern hybridization analyses
indicate that the RSW1 transcript in wild-type A. thaliana roots, leaves and inflorescences
is approximately 4.0 kb in length, and that a similar transcript size occurs in mutant tissue
(data not shown).
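The transcript dimensions quoted above can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: it uses the figures as stated in this example, and the constant names are hypothetical. The interpretation of the remainder (for instance a poly(A) tail, additional 5'-untranslated sequence, or the limited precision of sizing on a Northern blot) is an assumption and is not stated in the specification.

```python
# Figures as quoted in the text of this example.
FIVE_PRIME_UTR_MIN_BP = 70      # "at least 70 bp"
CODING_REGION_BP = 3243
THREE_PRIME_UTR_BP = 360
NORTHERN_ESTIMATE_BP = 4000     # "approximately 4.0 kb"

# A 3243 bp coding region is a whole number of codons.
codons = CODING_REGION_BP // 3
in_frame = CODING_REGION_BP % 3 == 0

# Minimum transcript length implied by the three stated components.
assembled_min_bp = FIVE_PRIME_UTR_MIN_BP + CODING_REGION_BP + THREE_PRIME_UTR_BP

# Difference between the Northern estimate and the stated components.
unaccounted_bp = NORTHERN_ESTIMATE_BP - assembled_min_bp

if __name__ == "__main__":
    print("codons in coding region:", codons, "(in frame:", in_frame, ")")
    print("sum of stated components (bp):", assembled_min_bp)
    print("not accounted for (bp):", unaccounted_bp)
```

The codon count agrees with the 1081-residue polypeptide length given for SEQ ID NO:6.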

The derived amino acid sequence of the RSW1 polypeptide encoded by the cosmid clone
23H12 (i.e. the polypeptide set forth in SEQ ID NO:6) is 1081 amino acids in length and
contains the entire D, D, D, QXXRW signature characteristic of β-glycosyl transferase
proteins, between amino acid position 395 and amino acid position 822. The conserved
Aspartate, Glutamine, Arginine and Tryptophan residues are shown below, in bold type,
with the local amino acid residues also indicated:

1. Amino acid residues 391 to 399 of SEQ ID NO:6:
YVSDDGSAM;

2. Amino acid residues 557 to 565 of SEQ ID NO:6:
LLNVDCDHY;

3. Amino acid residues 776 to 784 of SEQ ID NO:6:
SVTEDILTG; and

4. Amino acid residues 814 to 826 of SEQ ID NO:6:
DRLNQVLRWALGS.

The second and third conserved Aspartate residues listed supra, and the fourth conserved
amino acid sequence motif listed supra (i.e. QVLRW) are also present in the cDNA clone
T20782 (see Example 4 above).
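The correspondence between the motif positions in SEQ ID NO:2 (Example 4) and SEQ ID NO:6 (this example) can be checked mechanically. The Python sketch below is a minimal illustration using only the segment sequences and start positions quoted in the two examples; the data layout and function names are hypothetical.

```python
import re

# Local segments and their first residue numbers, as quoted in Examples 4 and 5.
SEGMENTS = {
    "LLNVDCDHY": {"SEQ ID NO:2": 105, "SEQ ID NO:6": 557},
    "SVTEDILTG": {"SEQ ID NO:2": 324, "SEQ ID NO:6": 776},
    "DRLNQVLRWALGS": {"SEQ ID NO:2": 362, "SEQ ID NO:6": 814},
}


def offsets():
    """Difference in residue numbering between the two sequences, per segment."""
    return {
        segment: starts["SEQ ID NO:6"] - starts["SEQ ID NO:2"]
        for segment, starts in SEGMENTS.items()
    }


def qxxrw_span(segment, first_residue):
    """Residue numbers of the Q and the W of a QXXRW match, or None."""
    match = re.search(r"Q..RW", segment)
    if match is None:
        return None
    return first_residue + match.start(), first_residue + match.end() - 1


if __name__ == "__main__":
    print("numbering offsets:", offsets())
    print("QXXRW in SEQ ID NO:2:", qxxrw_span("DRLNQVLRWALGS", 362))
    print("QXXRW in SEQ ID NO:6:", qxxrw_span("DRLNQVLRWALGS", 814))
```

A single constant offset across all three segments is what would be expected if T20782 encodes a contiguous C-terminal portion of the RSW1 polypeptide.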

The 23H12 clone potentially encodes a cellulose synthase. 

EXAMPLE 6

COMPLEMENTATION OF THE rsw1 MUTATION

The complementation of the cellulose mutant plant rsw1 is the key test to demonstrate the
function of the clone 23H12 gene product. Complementation of the rsw1 phenotype was
demonstrated by transforming the binary cosmid clone 23H12, or a derivative clone thereof
encoding a functional gene product, into the Arabidopsis thaliana cellulose mutant rsw1.
Two DNA constructs (23H12 and pRSW1) were used to complement the rsw1 mutant plant
line.

1. Construct 23H12

Clone 23H12 is described in Example 5 and Figure 5. 

2. Construct pRSW1

The 23H12 construct has an insert of about 21 kb in length. To demonstrate that any
complementation of the phenotype of the rsw1 mutation is the result of expression of the gene
which corresponds to SEQ ID NO:3, a genetic construct, designated as pRSW1, comprising
the putative RSW1 gene with most of the surrounding DNA deleted, was produced. A
restriction enzyme (RE) map of the RSW1 gene insert in pRSW1 is provided in Figure 5.

To produce pRSW1, the RSW1 gene was subcloned from cosmid 23H12 and cloned into the
binary plasmid pBIN19. Briefly, Escherichia coli cells containing cosmid 23H12 were grown
in LB medium supplemented with tetracycline (3.5 mg/L). Plasmid DNA was prepared by
alkaline lysis and digested sequentially with restriction enzymes PvuII and SalI. Two
co-migrating fragments of 9 kb and 10 kb, respectively, were isolated as a single fraction from
a 0.8% (w/v) agarose gel. The RSW1 gene was contained on the 10 kb PvuII/SalI fragment.
The 9 kb fragment appeared to be a PvuII cleavage product not comprising the RSW1 gene.
The restriction fragments were ligated into pBIN19 digested with SmaI and SalI. An aliquot
of the ligation mix was introduced by electroporation into E. coli strain XLB1. Colonies
resistant to kanamycin (50 mg/L) were selected and subsequently characterised by restriction
enzyme analysis to identify those clones which contained only the 10 kb PvuII/SalI fragment
comprising the RSW1 gene, in pBIN19.

3. Transfer of the 23H12 and pRSW1 constructs to Agrobacterium tumefaciens

Cosmid 23H12 was transferred to Agrobacterium by triparental mating, essentially as
described by Ditta et al. (1980). Three bacterial strains as follows were mixed on solid LB
medium without antibiotics: Strain 1 was an E. coli helper strain containing the mobilising
plasmid pRK2013, grown to stationary phase; Strain 2 was E. coli containing cosmid 23H12,
grown to stationary phase; and Strain 3 was an exponential-phase culture of A. tumefaciens
strain AGL1 (Lazo et al., 1991). The mixture was allowed to grow overnight at 28 °C, before
an aliquot was streaked out on solid LB medium containing antibiotics (ampicillin 50 mg/L,
rifampicin 50 mg/L, tetracycline 3.5 mg/L) to select for transformed A. tumefaciens AGL1.
Resistant colonies appeared after 2-3 days at 28 °C and were streaked out once again on
selective medium for further purification. Selected colonies were then subcultured in liquid LB
medium supplemented with rifampicin (50 mg/L) and tetracycline (3.5 mg/L) and stored at
-80 °C.

Plasmid pRSW1 (initially designated as p2029) was introduced into A. tumefaciens strain
AGL1 by electroporation.

4. Transformation of rsw1 plants

The rsw1 plant line was transformed with constructs 23H12 and pRSW1 using vacuum
infiltration essentially as described by Bechtold et al. (1993).

5. Analysis of radial swelling in transformants 

Complementation of the radial swelling (rsw) phenotype, which is characteristic of the rsw1
mutant plant, was assayed by germinating transformed (i.e. T1 seed) rsw1 seeds obtained as
described supra on Hoaglands plates containing 50 µg/ml kanamycin. Plates containing the
transformed seeds were incubated at 21 °C for 10-12 days. Kanamycin-resistant seedlings were
transferred to fresh Hoaglands plates containing 50 µg/ml kanamycin and incubated at 31 °C
for 2 days. Following this incubation, the root tip was examined for a radial swelling
phenotype. Under these conditions, the roots of wild-type plants do not show any radial
swelling phenotype; however, the roots of rsw1 plants show clear radial swelling at the root tip
and also have a short root compared to the wild-type plants. As a consequence, determination
of the radial swelling phenotype of the transformed plants was indicative of successful
complementation of the rsw1 phenotype.

The kanamycin-resistant seedlings were maintained by further growth of seedlings at 21 °C,
following the high temperature incubation. Once plants had recovered, the seedlings were
transferred to soil and grown in cabinets at 21 °C (16 hr light/8 hr dark cycle). T2 seed was then
harvested from mature individual plants.

Using the 23H12 construct for rsw1 transformation, a total of 262 kanamycin-resistant
seedlings were obtained. All of these transformants were tested for complementation of the
root radial swelling phenotype. A total of 230 seedlings showed a wild type root phenotype,
while only 32 seedlings showed the radial swelling root phenotype characteristic of rsw1
plants. By way of example, Figure 6 shows the phenotypes of transformed seedlings compared
to untransformed wild-type and rsw1 seedlings, following incubation at 31 °C. As shown in
Figure 6, there is clear complementation of the radial swelling phenotype in the transformed
seedlings, with normal root length being exhibited by the transformed seedlings at 31 °C.

Using the pRSW1 construct for transformation, a total of 140 kanamycin-resistant seedlings
were obtained. All of the 11 seedlings tested for complementation of the root radial swelling
phenotype showed a wild type root phenotype and none of the seedlings showed any signs of
radial swelling in the roots (data not shown).
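The seedling counts reported for the two constructs can be expressed as complementation frequencies. The Python sketch below is illustrative only; it uses the counts as stated above, and the dictionary keys and function name are hypothetical.

```python
# Counts of kanamycin-resistant seedlings scored for root phenotype,
# as stated in the text (23H12: all 262 scored; pRSW1: 11 of 140 scored).
SCORED = {
    "23H12": {"wild_type_root": 230, "radial_swelling": 32},
    "pRSW1": {"wild_type_root": 11, "radial_swelling": 0},
}


def percent_complemented(construct):
    """Percentage of scored seedlings with a wild-type root, one decimal."""
    counts = SCORED[construct]
    total = counts["wild_type_root"] + counts["radial_swelling"]
    return round(100 * counts["wild_type_root"] / total, 1)


if __name__ == "__main__":
    for construct in SCORED:
        print(construct, "complemented:", percent_complemented(construct), "%")
```

Because only 11 of the 140 pRSW1 seedlings were scored, the figure for that construct rests on a much smaller sample than the figure for 23H12.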

6. General morphological analysis of the complemented rsw1 mutant line

Further characterisation of the complemented rsw1 plants has shown that other morphological
characteristics of rsw1 have also been restored in the transgenic lines, for example the bolt
(inflorescence) height, and the ability of the plants to grow wild type cotyledons, leaves,
trichomes, siliques and flowers at 31 °C (data not shown).

7. Biochemical complementation of the rsw1 mutant line

T2 seed from transformations using cosmid 23H12 as described supra or alternatively, using
the binary plasmid pBin19 which lacks any RSW1 gene sequences, was sown on Hoagland's
solid media containing kanamycin (50 µg/ml), incubated for 2 days at 21 °C and then
transferred to 31 °C for 5 days. Wild-type A. thaliana Columbia plants were grown under
similar conditions but without kanamycin in the growth medium. Kanamycin resistant T2
seedlings which have at least one copy of the 23H12 cosmid sequence, and wild-type
seedlings, were collected and frozen for cellulose analysis.

Cellulose levels were determined as acetic-nitric acid insoluble material (Updegraph, 1969)
for 10 lines of kanamycin-resistant T2 plants transformed with the 23H12 cosmid sequence,
and compared to the cellulose levels in rsw1 mutant plants, wild-type A. thaliana Columbia
plants and A. thaliana Columbia plants transformed with the binary plasmid pBin19. The
results are provided in Table 7.

As shown in Table 7, the cellulose levels have been significantly elevated in the
complemented rsw1 (T2) plants, compared to the cellulose levels measured in the rsw1 mutant
parent plant. In fact, cellulose levels in the 23H12-transformed plants, expressed relative to the
fresh weight of plant material or on a per seedling basis, are not significantly different from
the cellulose levels of either wild-type Arabidopsis thaliana Columbia plants or A. thaliana
Columbia transformed with the binary plasmid pBin19. These data indicate that the 23H12
cosmid is able to fully complement the cellulose-deficient phenotype of the rsw1 mutant.

Homozygous T3 lines are generated to confirm the data presented in Table 7. 

Furthermore, data presented in Table 7 indicate that there is no difference in the rate of
growth of the T2 transformed rsw1 plants and wild-type plants at 31 °C, because the fresh
weight of such plants does not differ significantly. In contrast, the fresh weight of mutant
rsw1 seedlings grown under identical conditions is only approximately 55% of the level
observed in T2 lines transformed with 23H12 (range about 30% to about 80%). These data
support the conclusion that cellulose levels have been manipulated in the complemented rsw1
(T2) plants.

Furthermore, the rate of cellulose synthesis in 23H12-transformed plants and wild-type
plants at 31 °C, as measured by 14C incorporation, is also determined.

Furthermore, the β-1,4-glucan levels and starch levels in the 23H12 transformant lines are
shown to be similar to the β-1,4-glucan and starch levels in wild-type plants.

TABLE 7

CELLULOSE LEVELS IN rsw1 PLANTS TRANSFORMED
WITH COSMID CLONE 23H12

    PLANT LINE           SAMPLE SIZE        SEEDLING FRESH    CELLULOSE          CELLULOSE
                         (No. of plants)    WEIGHT (mg)       (mg cellulose/     (mg cellulose/
                                                              100 mg tissue)     seedling)
    1.2 (rsw1+23H12)     126                2.51              1.23               0.031
    1.4 (rsw1+23H12)     132                2.25              2.50               0.056
    2.1 (rsw1+23H12)     126                3.23              1.29               0.042
    3.1 (rsw1+23H12)     127                3.75              1.23               0.046
    3.10 (rsw1+23H12)    128                3.52              1.69               0.060
    4.4 (rsw1+23H12)     110                5.14              1.31               0.067
    4.5 (rsw1+23H12)     125                3.18              1.26               0.040
    5.3 (rsw1+23H12)     124                2.77              1.17               0.032
    9.2 (rsw1+23H12)     125                2.26              1.41               0.032
    10.8 (rsw1+23H12)    126                2.4               1.20               0.029
    Columbia/pBin19      106                2.64              1.34               0.035
    Columbia             178                2.73              1.18               0.032
    rsw1 mutant          179                1.77              0.84               0.015
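The columns of Table 7 are related arithmetically, and the fresh-weight comparison made in the discussion of this table can be reproduced from it. The Python sketch below is a minimal illustration using the values as transcribed from Table 7; the tuple layout and function names are hypothetical, and the tolerance simply allows for rounding in the published figures.

```python
# (plant line, fresh weight in mg, mg cellulose per 100 mg tissue,
#  reported mg cellulose per seedling), transcribed from Table 7.
T2_LINES = [
    ("1.2", 2.51, 1.23, 0.031), ("1.4", 2.25, 2.50, 0.056),
    ("2.1", 3.23, 1.29, 0.042), ("3.1", 3.75, 1.23, 0.046),
    ("3.10", 3.52, 1.69, 0.060), ("4.4", 5.14, 1.31, 0.067),
    ("4.5", 3.18, 1.26, 0.040), ("5.3", 2.77, 1.17, 0.032),
    ("9.2", 2.26, 1.41, 0.032), ("10.8", 2.4, 1.20, 0.029),
]
CONTROLS = [
    ("Columbia/pBin19", 2.64, 1.34, 0.035),
    ("Columbia", 2.73, 1.18, 0.032),
    ("rsw1 mutant", 1.77, 0.84, 0.015),
]
RSW1_FRESH_WEIGHT_MG = 1.77


def cellulose_per_seedling(fresh_weight_mg, mg_per_100_mg):
    """Cellulose per seedling (mg) from fresh weight and cellulose content."""
    return fresh_weight_mg * mg_per_100_mg / 100


def max_deviation(rows):
    """Largest gap between the computed and the reported per-seedling value."""
    return max(
        abs(cellulose_per_seedling(fw, pct) - reported)
        for _, fw, pct, reported in rows
    )


def mutant_fresh_weight_ratios():
    """rsw1 fresh weight relative to the mean, heaviest and lightest T2 line."""
    weights = [fw for _, fw, _, _ in T2_LINES]
    mean = sum(weights) / len(weights)
    return (
        RSW1_FRESH_WEIGHT_MG / mean,
        RSW1_FRESH_WEIGHT_MG / max(weights),
        RSW1_FRESH_WEIGHT_MG / min(weights),
    )


if __name__ == "__main__":
    print("max deviation (mg):", round(max_deviation(T2_LINES + CONTROLS), 5))
    print("rsw1 / T2 fresh weight (mean, low, high):",
          [round(r, 2) for r in mutant_fresh_weight_ratios()])
```

The per-seedling column agrees with the product of the two preceding columns to within rounding, and the fresh-weight ratios fall in the range quoted in the text.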



EXAMPLE 7

DETERMINATION OF THE FULL-LENGTH NUCLEOTIDE SEQUENCE
ENCODING THE WILD-TYPE RSW1 POLYPEPTIDE

Arabidopsis thaliana double-stranded cDNA and cDNA libraries were prepared using the
CAPFINDER cDNA kit (Clontech). RNA was isolated from wild-type Columbia grown in
sterile conditions for 21 days.

Approximately 100,000 cDNA clones in an unamplified cDNA library were screened under
standard hybridization conditions at 65 °C, using a probe comprising 32P-labelled DNA
amplified from double stranded cDNA. To prepare the hybridization probe, the following
amplification primers were used:

1. 2280-F: 5' GAATCGGCTACGAATTTCCCA 3'
2. 2370-F: 5' TTGGTTGCTGGATCCTACCGG 3'
3. csp1-R: 5' GGTTCTAAATCTTCTTCCGTC 3'

wherein the primer combinations were either 2280-F/csp1-R or 2370-F/csp1-R. The primer
2280-F corresponds to nucleotide positions 2226 to 2246 in SEQ ID NO:3, upstream of the
translation start site. The primer 2370-F corresponds to nucleotide positions 2314 to 2334
in SEQ ID NO:3, encoding amino acids 7 through 13 of the RSW1 polypeptide. The primer
csp1-R comprises nucleotide sequences complementary to nucleotides 588 to 608 of the
T20782 clone (SEQ ID NO:1) corresponding to nucleotides 6120 to 6140 of SEQ ID NO:3.
The hybridization probes produced are approximately 1858 nucleotides in length (2280-
F/csp1-R primer combination) or 1946 nucleotides in length (2370-F/csp1-R primer
combination).
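The primer coordinates given above can be checked against the primer sequences themselves. The Python sketch below is illustrative only: it compares the length of each primer, as transcribed here, with the length of the coordinate span quoted for it. The data layout and function names are hypothetical.

```python
# Primer sequences and the coordinate spans quoted for them in the text.
PRIMERS = {
    "2280-F": {"sequence": "GAATCGGCTACGAATTTCCCA", "first": 2226, "last": 2246},
    "2370-F": {"sequence": "TTGGTTGCTGGATCCTACCGG", "first": 2314, "last": 2334},
    "csp1-R": {"sequence": "GGTTCTAAATCTTCTTCCGTC", "first": 588, "last": 608},
}


def span_length(first, last):
    """Number of positions in an inclusive coordinate range."""
    return last - first + 1


def lengths_agree():
    """For each primer, whether its length equals its quoted span length."""
    return {
        name: len(p["sequence"]) == span_length(p["first"], p["last"])
        for name, p in PRIMERS.items()
    }


if __name__ == "__main__":
    print(lengths_agree())
    # 2370-F is stated to encode amino acids 7 through 13, i.e. 7 codons.
    print("codons covered by 2370-F:", len(PRIMERS["2370-F"]["sequence"]) // 3)
```

Each primer is 21 nucleotides long, matching its quoted span, and 21 nucleotides correspond to the seven codons (amino acids 7 through 13) attributed to primer 2370-F.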

Five hybridizing bacteriophage clones were identified, which were plaque-purified to
homogeneity during two successive rounds of screening. Plasmids were rescued from the
positively-hybridizing bacteriophage clones, using the Stratagene excision protocol for the
ZapExpress™ vector according to the manufacturer's instructions. Colony hybridizations
confirmed the identity of the clones.

Isolated cDNA clones were sequenced by primer walking similar to the method described
in Examples 4 and 5 supra.

A full-length wild-type RSW1 nucleotide sequence was compiled from the nucleotide
sequences of two cDNA clones. First, the 3'-end of the cDNA, encoding amino acids 453-
1081 of RSW1, corresponded to the nucleotide sequence of the EST clone T20782 (SEQ ID
NO:1). The remaining cDNA sequence, encoding amino acids 1-654 of RSW1, was
generated by amplification of the 5'-end from cDNA, using primer 2280-F, which
comprises nucleotide sequences approximately 50-70 bp upstream of the RSW1 translation
start site in cosmid 23H12, and primer csp1-R, which comprises nucleotide sequences
complementary to nucleotides 588 to 608 of the T20782 clone (SEQ ID NO:1).

Several amplified clones are sequenced to show that no nucleotide errors were introduced
by the amplification process. The 5' and 3' nucleotide sequences are spliced together to
produce the complete RSW1 open reading frame and 3'-untranslated region provided in SEQ
ID NO:5.

Those skilled in the art will be aware that the 5'-end and 3'-end of the two incomplete
cDNAs are spliced together to obtain a full-length cDNA clone, the nucleotide sequence of
which is set forth in SEQ ID NO:5.

Of the remaining cDNA clones, no isolated cDNA clone comprised a nucleotide sequence
which precisely matched the nucleotide sequence of the RSW1 gene present in cosmid
23H12. However, several clones containing closely-related sequences were obtained, as
summarised in Table 8. The nucleotide sequences of the Ath-A and Ath-B cDNAs are
provided herein as SEQ ID Nos: 7 and 9, respectively.

TABLE 8

CHARACTERISATION OF A. thaliana cDNA CLONES

    CLONE NAME    DESCRIPTION           LENGTH         SEQ ID NO:
    RSW1.1A       chimeric clone        partial        not provided
    RSW1A         chimeric clone        partial        not provided
    Ath-A         12C4 cDNA             full-length    SEQ ID NO:7
    Ath-B         new sequence          full-length    SEQ ID NO:9
    RSW4A         identical to Ath-B    full-length    not provided



The derived amino acid sequences encoded by the cDNAs listed in Table 8 are provided in
Figures 8 and 9 and SEQ ID Nos: 8 and 10 herein.

Figure 10 is a schematic representation of the important features of the RSW1 polypeptide
which are conserved within A. thaliana and between A. thaliana and other plant species. In
addition to the species indicated in Figure 10, the present inventors have also identified
maize, wheat, barley and Brassica ssp. cellulose biosynthetic genes by homology search.
Accordingly, the present invention extends to cellulose genes and cellulose biosynthetic
polypeptides as hereinbefore defined, derived from any plant species, including A. thaliana,
cotton, rice, wheat, barley, maize, Eucalyptus ssp., Brassica ssp., Pinus ssp., Populus ssp.,
Picea ssp., hemp, jute and flax, amongst others.

EXAMPLE 8 

ISOLATION OF FULL-LENGTH NUCLEOTIDE SEQUENCE ENCODING THE 

MUTANT RSW1 POLYPEPTIDE 

25 

Arabidopsis thaliana double-stranded cDNA and cDNA libraries were prepared using the
CAPFINDER cDNA kit (Clontech). RNA was isolated from Arabidopsis thaliana Columbia
rsw1 mutant plants grown in sterile conditions for 21 days.

The full-length rsw1 mutant nucleotide sequence was generated by sequencing two amplified
DNA fragments spanning the rsw1 mutant gene. The 5'-end sequence of the cDNA
(comprising the 5'-untranslated region and exons 1-11) was amplified using the primer
combination 2280-F/csp1-R (Example 7). The 3'-end sequence was amplified using the
primers EST1-F and cs3-R set forth below:

1. Primer EST1-F: 5' AATGCTTCTTGTTGCCAAAGCA 3'

2. Primer cs3-R: 5' GACATGGAATCACCTTAACTGCC 3'

wherein primer EST1-F corresponds to nucleotide positions 1399-1420 of SEQ ID NO:5
(within exon 8) and primer cs3-R is complementary to nucleotides 3335-3359 of SEQ ID
NO:5 (within the 3'-untranslated region of the wild-type transcript).
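
By way of illustration only, the primer coordinates quoted above can be sanity-checked with a
short script. The following is a minimal sketch, assuming the primer sequences are as printed
above with extraction spaces removed; the helper names are hypothetical and form no part of
the specification. It reports the length of each primer alongside the length of the quoted
interval in SEQ ID NO:5, and gives the reverse complement of cs3-R, which is the orientation
in which that primer would appear on the sense strand.

```python
# Sketch: compare printed primer lengths with the quoted SEQ ID NO:5 intervals.
# Primer strings are taken from the text above; helper names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Number of positions in an inclusive, 1-based interval."""
    return last - first + 1


EST1_F = "AATGCTTCTTGTTGCCAAAGCA"   # forward primer, positions 1399-1420
CS3_R = "GACATGGAATCACCTTAACTGCC"   # reverse primer, positions 3335-3359

PRIMERS = [
    ("EST1-F", EST1_F, (1399, 1420)),
    ("cs3-R", CS3_R, (3335, 3359)),
]

for name, primer, (first, last) in PRIMERS:
    print(f"{name}: primer length {len(primer)} nt, "
          f"quoted interval {span_length(first, last)} nt")

# Sense-strand orientation of the reverse primer.
print("cs3-R on sense strand:", reverse_complement(CS3_R))
```

Any difference between a printed primer length and its quoted interval is simply reported;
the sketch does not attempt to decide which of the two values is correct.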

The full-length sequence of the mutant rsw1 transcript is set forth herein as SEQ ID NO:11.

Whilst not being bound by any theory or mode of action, a single nucleotide substitution
in the rsw1 mutant nucleotide sequence (nucleotide position 1716 in SEQ ID NO:11),
relative to the wild-type RSW1 nucleotide sequence (nucleotide position 1646 in SEQ ID
NO:5), resulting in Ala549 being substituted with Val549 in the mutant polypeptide, may
contribute to the altered activity of the RSW1 polypeptide at non-permissive temperatures
such as 31 °C. Additional amino acid substitutions are also contemplated by the present
invention, to alter the activity of the RSW1 polypeptide, or to make the polypeptide
temperature-sensitive.
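
The relationship between a single nucleotide substitution and an Ala to Val replacement can
be illustrated as follows. This is a sketch only: the actual codon at position 549 is not
stated in this passage, so the script enumerates all four alanine codons of the standard
genetic code (GCN) and shows that a C to T change at the second codon position converts each
of them to a valine codon (GTN). The function names are hypothetical.

```python
# Sketch: a single C->T change at codon position 2 converts any Ala codon to Val.
# Only the Ala (GCN) and Val (GTN) entries of the standard genetic code are needed.

ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def amino_acid(codon: str) -> str:
    """Translate an Ala or Val codon; other codons are outside this sketch."""
    if codon in ALA_CODONS:
        return "Ala"
    if codon in VAL_CODONS:
        return "Val"
    raise ValueError(f"codon {codon!r} is not covered by this sketch")


def substitute(codon: str, index: int, base: str) -> str:
    """Return the codon with the base at 0-based position `index` replaced."""
    return codon[:index] + base + codon[index + 1:]


for codon in sorted(ALA_CODONS):
    mutant = substitute(codon, 1, "T")
    print(f"{codon} ({amino_acid(codon)}) -> {mutant} ({amino_acid(mutant)})")
```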

EXAMPLE 9 

ANTISENSE INHIBITION OF CELLULOSE PRODUCTION
IN TRANSGENIC PLANTS

1. Construction of an antisense RSW1 binary vector

One example of transgenic plants in which cellulose production is inhibited is provided by
the expression of an antisense genetic construct therein. Antisense technology is used to
target expression of one or more cellulose genes, to reduce the amount of cellulose produced
by transgenic plants.

By way of exemplification, an antisense plant transformation construct has been engineered
to contain the T20782 cDNA insert (or a part thereof) in the antisense orientation and in
operable connection with the CaMV 35S promoter present in the binary plasmid pRD410
(Datla et al., 1992). More particularly, the T20782 cDNA clone, which comprises the 3'-end
of the wild-type RSW1 gene, was digested with XbaI and KpnI and cloned into the
kanamycin-resistant derivative of pGEM3zf(-), designated plasmid pJKKMf(-). The
RSW1 sequence was sub-cloned, in the antisense orientation, into the binary vector pRD410
as an XbaI/SacI fragment, thereby replacing the β-glucuronidase (GUS or uidA) gene. This
allows the RSW1 sequence to be transcribed in the antisense orientation under the control
of the CaMV 35S promoter.

The antisense RSW1 binary plasmid vector was transferred to Agrobacterium tumefaciens
strain AGL1, by triparental mating and selection on rifampicin and kanamycin, as described
by Lazo et al. (1991). The presence of the RSW1 insert in transformed A. tumefaciens cells
was confirmed by Southern hybridization analysis (Southern, 1975). The construct was
shown to be free of deletions or rearrangements prior to transformation of plant tissues, by
back-transformation into Escherichia coli strain JM101 and restriction digestion analysis.

2. Transformation of Arabidopsis thaliana 

Eight pots, each containing approximately 16 A. thaliana ecotype Columbia plants, were
grown under standard conditions. Plant tissue was transformed with the antisense RSW1
binary plasmid by vacuum infiltration as described by Bechtold et al. (1993). Infiltration
media contained 2.5% (w/v) sucrose and plants were infiltrated for 2 min until a vacuum of
approximately 400 mm Hg was obtained. The vacuum connection was shut off and plants were
allowed to sit under vacuum for 5 min.

Approximately 34,000 T1 seeds were screened on MS plates containing 50 µg/ml kanamycin,
to select for plants containing the antisense RSW1 construct. Of the T1 seed sown, 135
kanamycin-resistant seedlings were identified, of which 91 were transferred into soil and
grown at 21 °C under a long-day photoperiod (16 hr light; 8 hr dark).
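
The screening figures above imply an apparent transformation frequency of roughly 0.4% of
T1 seed. The arithmetic is set out in the following minimal sketch, which assumes only the
counts stated in the text (the seed number being approximate); the function name is
hypothetical.

```python
# Sketch: apparent transformation frequency from the stated screening counts.

def fraction(part: int, whole: int) -> float:
    """Return part/whole, guarding against an empty denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


T1_SEED_SCREENED = 34_000       # approximate number of T1 seeds plated
KANAMYCIN_RESISTANT = 135       # resistant seedlings recovered
TRANSFERRED_TO_SOIL = 91        # resistant seedlings grown on in soil

frequency = fraction(KANAMYCIN_RESISTANT, T1_SEED_SCREENED)
grown_on = fraction(TRANSFERRED_TO_SOIL, KANAMYCIN_RESISTANT)

print(f"kanamycin-resistant seedlings per T1 seed: {frequency:.2%}")
print(f"resistant seedlings transferred to soil:   {grown_on:.1%}")
```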

Of the 91 transgenic lines, 19 lines were chosen for further analysis because the anther
filaments in each flower were too short to deposit pollen upon the stigma and, as a
consequence, hand-pollination was required to obtain T2 seed therefrom.

T2 seed from 14 of these 19 lines was plated out onto vertical Hoaglands plates containing
kanamycin to determine segregation ratios. Between five and ten seed were plated per
transgenic line. Control seeds, including A. thaliana Columbia containing the binary vector
pBIN19 (Bevan, 1984) and segregating 3:1 for kanamycin resistance, and the rsw1 mutant
transformed with the NPTII gene, also segregating 3:1 for kanamycin resistance, were grown
under the same conditions. Kanamycin-resistant plants were transferred to soil and grown at
21 °C under long days, until flowering. Untransformed Arabidopsis thaliana Columbia plants
were also grown under similar conditions, in the absence of kanamycin.
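
Segregation ratios of the kind referred to above are commonly compared with a 3:1
expectation using a chi-square goodness-of-fit statistic. The following is a minimal sketch
under stated assumptions: the seedling counts are hypothetical (no per-line counts are given
in this passage), the function name is hypothetical, and 3.841 is the standard 5% critical
value for one degree of freedom. With only five to ten seed per line the chi-square
approximation is rough, so the result is indicative at best.

```python
# Sketch: chi-square statistic for a 3:1 (resistant:sensitive) segregation.
# The counts used below are hypothetical examples, not data from the specification.

CRITICAL_5_PERCENT_DF1 = 3.841  # standard chi-square critical value, 1 d.f., p = 0.05


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Goodness-of-fit statistic against a 3:1 expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


for resistant, sensitive in [(8, 2), (5, 5), (75, 25)]:
    statistic = chi_square_3_to_1(resistant, sensitive)
    verdict = "consistent with" if statistic < CRITICAL_5_PERCENT_DF1 else "departs from"
    print(f"{resistant}:{sensitive} -> chi-square {statistic:.3f}, {verdict} 3:1")
```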

3. Morphology of antisense plants

A comparison of the morphology of antisense RSW1 plants grown at 21 °C, to mutant rsw1
plants grown at the non-permissive temperature (i.e. 31 °C), has identified a number of common
phenotypes. For example, the antisense plants exhibit reduced fertility, inflorescence
shortening and have short anthers, compared to wild-type plants, when grown at 21 °C. These
phenotypes are also observed in mutant rsw1 plants grown at 31 °C. These results suggest that
the antisense construct in the transgenic plants may be targeting the expression of the wild-type
RSW1 gene at 21 °C.

Figure 7 shows the reduced inflorescence (bolt) height in antisense 35S-RSW1 plants compared
to wild-type A. thaliana Columbia plants grown under identical conditions.

4. Cell wall carbohydrate analysis of antisense plants

T3 plants which are homozygous for the 35S-RSW1 antisense construct are generated and the
content of cellulose therein is determined as described in Example 1. Plants expressing the
antisense construct are shown to have significantly less cellulose in their cell walls, compared
to wild-type plants. Additionally, the levels of non-crystalline β-1,4-glucan and starch are
elevated in the cells of antisense plants, compared to otherwise isogenic plant lines which have
not been transformed with the antisense genetic construct.

5. Antisense 35S-RSW1 mRNA expression levels in transgenic plants

Total RNA was extracted from 0.2 g of leaf tissue derived from 33 kanamycin-resistant T1
plants containing the antisense 35S-RSW1 genetic construct, essentially according to
Longemann et al. (1986). Total RNA (25 µg) was separated on a 2.2 M formaldehyde/agarose
gel, blotted onto nylon filters and hybridized to a riboprobe comprising the sense strand
sequence of the cDNA clone T20782. To produce the riboprobe, T7 RNA polymerase was
used to transcribe sense RNA from a linearised plasmid template containing T20782, in the
presence of [α-32P]UTP. Hybridizations and subsequent washes were performed as described
by Dolferus et al. (1994). Hybridized membranes were exposed to Phosphor screens
(Molecular Dynamics, USA).

The levels of expression of the RSW1 antisense transcript were determined and compared to
the level of fertility observed for the plant lines. As shown in Table 9, the level of antisense
gene expression is correlated with the reduced fertility phenotype of the antisense plants. In
13 lines, a very high or high level of expression of the 35S-RSW1 antisense gene was observed
and, in 11 of these lines, fertility was reduced. Only lines 2W and 3E, which expressed high to
very high levels of antisense mRNA, appeared to be fully fertile. In 12 lines which expressed
medium levels of antisense mRNA, approximately one-half were fertile and one-half appeared
to exhibit reduced fertility. In contrast, in 8 plant lines in which only a low or very low level
of expression of the antisense 35S-RSW1 genetic construct was observed, a wild-type (i.e.
fertile) phenotype was observed for all but one transgenic line, line 2R.

Data presented in Table 9 and Figure 7 indicate that the phenotype of the cellulose-deficient
mutant rsw1 may be reproduced by expressing antisense RSW1 genetic constructs in transgenic
plants.

To confirm reduced cellulose synthesis and/or deposition in transgenic plants expressing the
antisense RSW1 gene, the level of cellulose is measured by the 14C incorporation assay or
as acetic/nitric acid insoluble material as described in Example 1 and compared to cellulose
production in otherwise isogenic wild-type plants. Cellulose production in the transgenic
plants is shown to be significantly reduced compared to wild-type plants. The severity of
phenotype of the transgenic plants thus produced varies considerably, depending to some
extent upon the level of inhibition of cellulose biosynthesis.

TABLE 9

LEVELS OF ANTISENSE GENE EXPRESSION AND FERTILITY IN
T1 LINES OF ANTISENSE 35S-RSW1 PLANTS

T1 PLANT   ANTISENSE 35S-RSW1   FERTILITY     T1 PLANT   ANTISENSE 35S-RSW1   FERTILITY
LINE       EXPRESSION                         LINE       EXPRESSION

B          very high            sterile*      2H         medium               fertile
2B         very high            sterile*      C          medium               sterile*
3E         very high            fertile       F          medium               sterile*
2E         high                 sterile*      2Q         medium               fertile
2K         high                 sterile*      3P         medium               sterile*
2M         high                 sterile*      3T         medium               fertile
2O         high                 sterile*      5D         medium               sterile*
2P         high                 sterile*      6A         medium               fertile
2W         high                 fertile       8E         low                  fertile
2Z         high                 sterile*      2R         low                  sterile*
3G         high                 sterile*      7A         low                  fertile
3Q         high                 sterile*      7S         low                  fertile
7Q         high                 sterile*      7O         low                  fertile
7N         medium               sterile*      7R         low                  fertile
7G         medium               fertile       1B         very low             fertile
1C         medium               sterile*      2U         very low             fertile
2X         medium               sterile*

* "sterile" does not indicate complete sterility, but that hand pollination, at least, is
required to obtain seed from such plants.
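
The correlation between antisense expression level and fertility described for Table 9 can be
tallied directly from the table entries. The following is a minimal sketch, assuming the
entries as transcribed above; the line names written here as 2O, 7O and 1B appear as "20",
"70" and "IB" in the extracted text and are read as shown by assumption. The grouping of
expression levels follows the text (very high/high, medium, low/very low), and the function
name is hypothetical.

```python
# Sketch: tally reduced-fertility ("sterile*") lines per expression group in Table 9.
# Line names 2O, 7O and 1B are assumed readings of garbled entries.

TABLE_9 = [
    ("B", "very high", "sterile"), ("2B", "very high", "sterile"),
    ("3E", "very high", "fertile"), ("2E", "high", "sterile"),
    ("2K", "high", "sterile"), ("2M", "high", "sterile"),
    ("2O", "high", "sterile"), ("2P", "high", "sterile"),
    ("2W", "high", "fertile"), ("2Z", "high", "sterile"),
    ("3G", "high", "sterile"), ("3Q", "high", "sterile"),
    ("7Q", "high", "sterile"), ("7N", "medium", "sterile"),
    ("7G", "medium", "fertile"), ("1C", "medium", "sterile"),
    ("2X", "medium", "sterile"), ("2H", "medium", "fertile"),
    ("C", "medium", "sterile"), ("F", "medium", "sterile"),
    ("2Q", "medium", "fertile"), ("3P", "medium", "sterile"),
    ("3T", "medium", "fertile"), ("5D", "medium", "sterile"),
    ("6A", "medium", "fertile"), ("8E", "low", "fertile"),
    ("2R", "low", "sterile"), ("7A", "low", "fertile"),
    ("7S", "low", "fertile"), ("7O", "low", "fertile"),
    ("7R", "low", "fertile"), ("1B", "very low", "fertile"),
    ("2U", "very low", "fertile"),
]

GROUPS = {
    "very high": "high/very high", "high": "high/very high",
    "medium": "medium",
    "low": "low/very low", "very low": "low/very low",
}


def tally(rows):
    """Return {group: (number of lines, number with reduced fertility)}."""
    counts = {}
    for _line, expression, fertility in rows:
        group = GROUPS[expression]
        total, reduced = counts.get(group, (0, 0))
        counts[group] = (total + 1, reduced + (fertility == "sterile"))
    return counts


for group, (total, reduced) in tally(TABLE_9).items():
    print(f"{group}: {reduced} of {total} lines with reduced fertility")
```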

EXAMPLE 10
RSW1 RELATED SEQUENCES IN RICE PLANTS

To identify RSW1 related nucleotide sequences in rice, a genetic sequence database was
searched for nucleotide sequences which were closely-related to one or more of the
Arabidopsis thaliana RSW1 nucleotide sequences described in the preceding Examples. Rice
EST S0542 (MAFF DNA bank, Japan) was identified, for which only a partial nucleotide
sequence was available. Additionally, before the instant invention, there was no probable
function attached to the rice EST S0542 sequence.

The present inventors have obtained the complete nucleotide sequence of clone S0542 and
derived the amino acid sequence encoded therefor. The S0542 cDNA is only 1741 bp in length
and appears to be a partial cDNA clone because, although it comprises 100 bp of
5'-untranslated sequence and contains the ATG start codon, it is truncated at the 3'-end and,
as a consequence, encodes only the first 547 amino acid residues of the rice RSW1 or RSW1-like
polypeptide. Based upon the length of the corresponding Arabidopsis thaliana RSW1
polypeptide (1081 amino acids), the rice RSW1 sequence set forth in SEQ ID NO:14 appears
to contain approximately one-half of the complete amino acid sequence.
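
The figures quoted for clone S0542 are mutually consistent, as the following minimal sketch
shows. It assumes the values stated above, with the garbled 5'-untranslated length read as
100 bp, and assumes that the open reading frame runs uninterrupted from the ATG to the
truncated 3'-end; the variable names are hypothetical.

```python
# Sketch: consistency check of the S0542 cDNA figures quoted in the text.
# Assumes the reading frame runs from the ATG to the truncated 3'-end of the clone.

CDNA_LENGTH_BP = 1741            # total length of the S0542 cDNA
UTR5_LENGTH_BP = 100             # 5'-untranslated sequence (assumed reading of the text)
ENCODED_RESIDUES = 547           # residues encoded by the partial clone
ARABIDOPSIS_RSW1_LENGTH = 1081   # length of the A. thaliana RSW1 polypeptide

coding_bp = CDNA_LENGTH_BP - UTR5_LENGTH_BP
codons, leftover_bases = divmod(coding_bp, 3)
fraction_of_full_length = ENCODED_RESIDUES / ARABIDOPSIS_RSW1_LENGTH

print(f"coding sequence in clone: {coding_bp} bp = {codons} codons "
      f"(+{leftover_bases} bases)")
print(f"fraction of A. thaliana RSW1 length: {fraction_of_full_length:.1%}")
```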

The N-terminal half of the rice RSW1 amino acid sequence is approximately 70% identical to
the Arabidopsis thaliana RSW1 polypeptide set forth in SEQ ID NO:6, with higher homology
(approximately 90%) occurring between amino acid residues 271-547 of the rice sequence.
These data strongly suggest that S0542 is the rice homologue of the A. thaliana RSW1 gene.
Alignments of rice, A. thaliana and cotton RSW1 amino acid sequences are presented in
Figures 9 and 10.
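
Percentage identity values such as those quoted above are derived from a pairwise alignment.
The following minimal sketch shows one common way of computing such a value from two
pre-aligned sequences, counting only columns in which neither sequence has a gap. The toy
sequences are hypothetical and are not taken from SEQ ID NO:6 or SEQ ID NO:14; other
definitions of identity (for example, using the length of the shorter sequence as
denominator) give different figures.

```python
# Sketch: percent identity over gap-free columns of a pre-computed pairwise alignment.
# The aligned strings below are hypothetical toy sequences.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identity (%) over columns where neither aligned sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" or residue_b == "-":
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


toy_a = "MKTA-GLW"
toy_b = "MRTAQGLW"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical over gap-free columns")
```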

To isolate full-length cDNA clones and genomic clone equivalents of S0542 (this study and
MAFF DNA bank, Japan) or D48636 (Pear et al., 1996), cDNA and genomic clone libraries
are produced using rice mRNA and genomic DNA respectively, and screened by hybridisation
using the S0542 or D48636 cDNAs as a probe, essentially as described herein.
Positive-hybridising plaques are identified and plaque-purified, during further rounds of
screening by hybridisation, to single plaques.

The rice clones are sequenced as described in the preceding Examples to determine the
complete nucleotide sequences of the rice RSW1 genes and derived amino acid sequences
therefor. Those skilled in the art will be aware that such gene sequences are useful for the
production of transgenic plants, in particular transgenic cereal plants having altered cellulose
content and/or quality, using standard techniques. The present invention extends to all such
genetic sequences and applications therefor.

EXAMPLE 11
RSW1 RELATED SEQUENCES IN COTTON PLANTS

A 32P-labelled RSW1 PCR fragment was used to screen approximately 200,000 cDNA clones
in a cotton fibre cDNA library. The RSW1 PCR probe was initially amplified from Arabidopsis
thaliana wild type cDNA using the primers 2280-F and csp1-R described in the preceding
Examples, and then re-amplified using the primer combination 2370-F/csp1-R, also described
in the preceding Examples.

Hybridisations were carried out under low stringency conditions at 55 °C.

Six putative positive-hybridising plaques were identified in the first screening round. Using
two further rounds of screening by hybridisation, four of these plaques were purified to single
plaques. Three plaques hybridised very strongly to the RSW1 probe, while the fourth plaque
hybridised less intensely.



We conclude that the positive-hybridising plaques which have been purified are strong
candidates for comprising cotton RSW1 gene sequences or RSW1-like gene sequences.
Furthermore, the cotton cDNAs may encode the catalytic subunit of cellulose synthase,
because the subunit protein architecture of cellulose synthase appears to be highly conserved
among plants, as highlighted in the preceding Example.

Furthermore, a Southern blot of cotton genomic DNA digested with BglII was hybridised with
the 5' end of the RSW1 cDNA, under low stringency hybridisation conditions at 55 °C. Results
are presented in Figure 11. These data demonstrate that RSW1-related sequences exist in the
cotton genome.

The cotton cDNA clones described herein are sequenced as described in the preceding
Examples and used to produce transgenic cotton plants having altered fibre characteristics. The
cDNAs are also used to genetically alter the cellulose content and/or quality of other plants,
using standard techniques.

EXAMPLE 12

RSW1 RELATED SEQUENCES IN EUCALYPTUS SSP.

Putative Eucalyptus ssp. cellulose synthase catalytic subunit gene fragments were obtained by
amplification using PCR. DNA primers were designed to conserved amino acid residues found
in the Arabidopsis thaliana RSW1 and 12C4 amino acid sequences. Three primers were used
for PCR. The primers are listed below (alternative bases at a degenerate position are shown
in parentheses):

pcsF-I    5'-A(A/G)AAGATIGA(C/T)TA(C/T)(C/T)TIAA(A/G)GA(C/T)AA-3'
pcsR-II   5'-ATIGTIGGIGTIC(G/T)(A/G)TT(C/T)TG(A/T/G/C)C(T/G)(A/T/C/G)CC-3'
pcsF-II   5'-GCIATGAA(A/G)(A/C)GIGAITA(C/T)GA(A/G)GA-3'
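
The degeneracy of each primer, that is, the number of distinct oligonucleotides in the
primer pool, follows from the number of alternative bases at each degenerate position. The
following minimal sketch computes it under stated assumptions: the primer strings are a
cleaned-up reading of the printed sequences, with alternatives grouped in parentheses, and
"I" is taken to denote inosine, which is counted as a single base. The function names are
hypothetical.

```python
# Sketch: length and degeneracy of the three primers, using a parenthesised notation
# for degenerate positions. "I" (assumed to be inosine) counts as a single base.
import re
from math import prod

PRIMERS = {
    "pcsF-I": "A(A/G)AAGATIGA(C/T)TA(C/T)(C/T)TIAA(A/G)GA(C/T)AA",
    "pcsR-II": "ATIGTIGGIGTIC(G/T)(A/G)TT(C/T)TG(A/T/G/C)C(T/G)(A/T/C/G)CC",
    "pcsF-II": "GCIATGAA(A/G)(A/C)GIGAITA(C/T)GA(A/G)GA",
}

# One token per primer position: either "(X/Y/...)" or a single base/inosine.
TOKEN = re.compile(r"\(([ACGT/]+)\)|([ACGTI])")


def positions(primer: str) -> list:
    """Return, for each primer position, the tuple of bases allowed there."""
    result = []
    for match in TOKEN.finditer(primer):
        alternatives, single = match.groups()
        result.append(tuple(alternatives.split("/")) if alternatives else (single,))
    return result


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(options) for options in positions(primer))


for name, primer in PRIMERS.items():
    print(f"{name}: {len(positions(primer))} positions, "
          f"{degeneracy(primer)}-fold degenerate")
```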

Using standard PCR conditions (50 °C annealing temperature) and solutions, the primer sets
pcsF-I/pcsR-II and pcsF-II/pcsR-II were used to amplify genetic sequences from pooled
Eucalyptus ssp. cDNA. In the first reaction, primers pcsF-I and pcsR-II were used to generate
a fragment approximately 700 bp in length. In the second PCR reaction, which used primers
pcsF-II and pcsR-II, a fragment estimated to be 700 bp in length was obtained. The sizes of
the PCR fragments are within the size range estimated for the corresponding Arabidopsis
thaliana sequences.

We conclude that the amplified Eucalyptus ssp. PCR fragments are likely to be related to the
Arabidopsis thaliana RSW1 gene and may encode at least a part of the Eucalyptus ssp.
cellulose synthase catalytic subunit.

The Eucalyptus ssp. PCR clones described herein are sequenced as described in the preceding
Examples and used to isolate the corresponding full-length Eucalyptus ssp. cDNAs and
genomic gene equivalents. Those skilled in the art will be aware that such gene sequences are
useful for the production of transgenic plants, in particular transgenic Eucalyptus ssp. plants
having altered cellulose content and/or quality, using standard techniques. The present
invention extends to all such genetic sequences and applications therefor.

EXAMPLE 13 

NON-CRYSTALLINE β-1,4-GLUCAN AS A MODIFIER

OF CELL WALL PROPERTIES 

The properties of plant cell walls depend on the carbohydrates, proteins and other polymers
of which they are composed and the complex ways in which they interact. Increasing the
quantities of non-crystalline β-1,4-glucan in cell walls affects those wall properties which
influence mechanical, nutritional and many other qualities, as well as having secondary
consequences resulting from the diversion of carbon into non-crystalline glucan at the expense
of other uses. To illustrate one of these effects, we investigated the ability of the
non-crystalline glucan to hydrogen bond to other wall components, particularly cellulose, in
the way that has been shown to be important for wall mechanics.

Hemicelluloses such as xyloglucans cross-link cellulose microfibrils by hydrogen bonding to
the microfibril surface (Levy et al., 1991). Since the β-1,4-glucan backbone of xyloglucan is
thought to be responsible for hydrogen bonding (with the xylose, galactose and fucose
substitutions limiting the capacity to form further hydrogen bonds), we can expect the
non-crystalline β-1,4-glucan also to have a capacity to hydrogen bond and cross link cellulose.
The effectiveness of strong alkalis in extracting xyloglucans is thought to relate to their
disruption of the hydrogen bonds with cellulose (Hayashi and MacLachlan, 1984).

To demonstrate that the non-crystalline β-1,4-glucan forms similar associations with the
cellulose microfibrils, we examined whether the 4 M KOH fraction, extracted from shoots of
the rsw1 mutant and from wild type RSW1 plants, contained non-crystalline glucan in addition
to xyloglucan. The non-crystalline glucan was separated from xyloglucan in the 4 M KOH
extract by dialysing the neutralised extract against distilled water and centrifuging at 14000 g
for 1 hour. The pellet was shown to be a pure β-1,4-glucan by using the methods for
monosaccharide analysis, methylation analysis and enzyme digestion used to characterise the
glucan in the ammonium oxalate fraction (see Example 1).

Table 10 shows the presence of substantial quantities of glucan recovered in pure form in the
pellet from 4 M KOH fractions extracted from the overproducing rsw1 mutant of Arabidopsis
thaliana. These data also demonstrate the presence of smaller quantities of non-crystalline
β-1,4-glucan in the 4 M KOH fraction from wild type plants, compared to rsw1, particularly
when grown at 31 °C.

TABLE 10

Glucose contents* of 4 M KOH fractions from shoots of wild-type and
rsw1 mutant Arabidopsis thaliana plants

Glucose fraction                        wild-type             rsw1 mutant
                                        21 °C     31 °C       21 °C     31 °C

xyloglucan and non-crystalline
glucan in whole extract                 36.4      56.9        27.1      93.1

non-crystalline glucan in pellet         7.8      20.5         7.6      56.0

* nmol glucose/mg plant dry weight after TFA hydrolysis
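
The proportion of glucose in the 4 M KOH extract that is recovered in the pellet as
non-crystalline glucan can be derived from the values in Table 10. The following minimal
sketch uses only the tabulated figures; the dictionary layout and function name are
hypothetical.

```python
# Sketch: share of 4 M KOH-extracted glucose recovered in the pellet (Table 10 values,
# nmol glucose per mg plant dry weight).

TABLE_10 = {
    # (genotype, growth temperature in deg C): (whole extract, pellet)
    ("wild-type", 21): (36.4, 7.8),
    ("wild-type", 31): (56.9, 20.5),
    ("rsw1", 21): (27.1, 7.6),
    ("rsw1", 31): (93.1, 56.0),
}


def pellet_share(whole_extract: float, pellet: float) -> float:
    """Fraction of the whole-extract glucose that is recovered in the pellet."""
    if whole_extract <= 0:
        raise ValueError("whole extract value must be positive")
    return pellet / whole_extract


for (genotype, temperature), (whole, pellet) in TABLE_10.items():
    print(f"{genotype} at {temperature} deg C: "
          f"{pellet_share(whole, pellet):.1%} of extract glucose in pellet")

# Ratio of pellet glucan in rsw1 to wild type at the restrictive temperature.
fold_at_31 = TABLE_10[("rsw1", 31)][1] / TABLE_10[("wild-type", 31)][1]
print(f"rsw1 / wild-type pellet glucan at 31 deg C: {fold_at_31:.2f}-fold")
```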

The monosaccharide composition of the supernatant remaining after centrifugation was
determined after TFA hydrolysis. These data, and data from methylation analysis, are
consistent with the supernatant being a relatively pure xyloglucan. The supernatant was free
of glucan, because no glucose could be released by the endocellulase/β-glucosidase mixture
that released glucose from β-1,4-glucan.

The presence of both non-crystalline β-1,4-glucan and xyloglucan in the 4 M KOH fraction,
when taken together with the implications from structural predictions (Levy et al., 1991), is
consistent with some of the non-crystalline β-1,4-glucan in the wall hydrogen bonding to
cellulose microfibrils in similar fashion to the β-1,4-glucan backbone of xyloglucan.

The cross linking provided when xyloglucans and other hemicelluloses bind to two or more
microfibrils is an important determinant of the mechanical properties of cellulosic walls
(Hayashi, 1989). The effects of increasing the amounts of non-crystalline β-1,4-glucan in walls
are likely to be greatest in walls which otherwise possess relatively low levels of cross linking
as a result of high ratios of cellulose:hemicelluloses. Such conditions are common in
secondary walls including those of various fibres, and the cellulose:hemicellulose ratio is
particularly high in cotton fibres.

The effects on wall mechanical properties of overproducing non-crystalline glucan are shown
by transforming plants with the mutant allele of rsw1 (SEQ ID NO:11) operably under the
control of either the RSW1 promoter derived from SEQ ID NO:3 or SEQ ID NO:4 or,
alternatively, an appropriate constitutive promoter such as the CaMV 35S promoter.
Production of non-crystalline glucan is quantified by fractionating the cell walls using the
methods described above to show in particular that non-crystalline glucan is recovered in the
4 M KOH fraction. Mechanical properties of the cell walls are measured using standard
methods for fibre analysis to study parameters such as stress-strain curves and breaking strain,
amongst other properties.

EXAMPLE 14
OVER-EXPRESSION OF CELLULOSE SYNTHASE

IN TRANSGENIC PLANTS

Three strategies are employed to over-express cellulose synthase in Arabidopsis thaliana
plants.

In the first strategy, the CaMV 35S promoter sequence is operably connected to the full-length
cellulose synthase cDNA which is obtainable by primer extension of SEQ ID NO:1. This is
achievable by cloning the full-length cDNA encoding cellulose synthase, in the sense
orientation, between the CaMV 35S promoter or other suitable promoter operable in plants and
the nopaline synthase terminator sequences of the binary plasmid pBI121.

In the second strategy, the coding part of the genomic gene is cloned, in the sense orientation,
between the CaMV 35S promoter and the nopaline synthase terminator sequences of the binary
plasmid pBI121.

In the third strategy, the 23H12 binary cosmid clone or the derivative pRSW1, containing the
cellulose synthase gene sequence operably under the control of the cellulose synthase gene
promoter and terminator sequences, is prepared in a form suitable for transformation of plant
tissue.

For Agrobacterium-mediated tissue transformation, binary plasmid constructs discussed supra
are transformed into Agrobacterium tumefaciens strain AGL1 or other suitable strain. The
recombinant DNA constructs are then introduced into wild type Arabidopsis thaliana plants
(Columbia ecotype), as described in the preceding Examples.

Alternatively, plant tissue is directly transformed using the vacuum infiltration method
described by Bechtold et al. (1993).

The transgenic plants thus produced exhibit a range of phenotypes, partly because of position
effects and variable levels of expression of the cellulose synthase transgene.

Cellulose content in the transgenic plants and isogenic untransformed control plants is
determined by the 14C incorporation assay or as acetic/nitric acid insoluble material as
described in Example 1. In general, the level of cellulose deposition and rates of cellulose
biosynthesis in the transgenic plants are significantly greater than for untransformed control
plants.

Furthermore, in some cases, co-suppression leads to mimicry of the rsw1 mutant phenotype.



EXAMPLE 15

SITE-DIRECTED MUTAGENESIS OF THE RSW1 GENE

The nucleotide sequence of the RSW1 gene contained in 23H12 is mutated using site-directed
mutagenesis, at several positions, to alter its catalytic activity or substrate affinity or glucan
properties. In one example, the RSW1 gene is mutated to comprise one or more mutations
present in the mutant rsw1 allele.

The mutated genetic sequences are cloned into the binary plasmids described in the preceding
Examples, in place of the wild-type sequences. Plant tissue obtained from both wild-type
Arabidopsis thaliana (Columbia) plants and A. thaliana rsw1 plants is transformed as
described herein and whole plants are regenerated.

Control transformations are performed using the wild-type cellulose synthase gene sequence.

EXAMPLE 16

PHENOTYPES OF PLANTS EXPRESSING MUTATED RSW1 GENES

Plants transformed with genetic constructs described in Example 15 (and elsewhere) are
categorised initially on the basis of number of transgene copies, to eliminate variability arising
therefrom. Plants expressing single copies of different transgenes are analysed further for cell
wall components, including cellulose, non-crystalline β-1,4-glucan polymer, starch and
carbohydrate content.

1. Cellulose content

Cellulose content in the transgenic plants is determined by the 14C incorporation assay as
described in Example 1. Cell walls are prepared, fractionated and the monosaccharide
composition of individual fractions determined as in Example 1.

2. Non-crystalline β-1,4-glucan content

Transgenic plants expressing the rsw1 mutant allele exhibit a higher level of non-crystalline,
and therefore extractable, β-1,4-glucan in cell walls compared to plants expressing an
additional copy of the wild-type RSW1 allele. Thus, it is possible to change the crystallinity
of the β-1,4-glucan chains present in the cell wall by mutation of the wild-type RSW1 allele.

3. Starch content

Transgenic plants are also analysed to determine the effect of mutagenesis of the RSW1 gene
on the level of starch deposited in their roots. The quantity of starch present in material
prepared from the crude wall fraction is determined using the anthrone/H2SO4 method
described in Example 1. The data show that mutating the RSW1 gene to the mutant rsw1 allele
increases starch deposition. This demonstrates that the gene can be used to alter the
partitioning of carbon into carbohydrates other than cellulose.

4. Cell wall composition

The cell wall composition of transgenic plant material is also analysed. Wild type, rsw1
and transgenic seedlings are grown for 2 d at 21 °C and then kept for a further 5 d at either
21 °C or 31 °C. With transfer to 31 °C when the seed has scarcely germinated, the wall
composition at final harvest largely reflects the operation of the mutated rsw1 gene product at
its restrictive temperature. Cell wall fractionation is carried out in similar fashion to that
described for the 14C experiment (Example 1) and the monosaccharide composition of each
fraction is quantified by GC/MS after hydrolysis with trifluoroacetic acid or, in the case of
crystalline cellulose, H2SO4.

In some transgenic plants in which the RSW1 gene is mutated, the monosaccharide
composition is comparable to that observed for homozygous rsw1 plants, confirming that
there is a major reduction in the quantity of crystalline cellulose in the
final, acid insoluble fraction. Thus, mutation of the RSW1 gene can be performed to produce
changes in the composition of plant cell walls.

EXAMPLE 17

CHEMICAL MODIFICATION OF THE RSW1 GENE TO MANIPULATE
CELLULOSE PRODUCTION AND PLANT CELL WALL CONTENT



As demonstrated in the preceding Examples, the RSW1 gene is involved in cellulose
production and the manipulation of cell wall content.

In the present Example, to identify novel phenotypes and gene sequences important for the
normal functioning of the cellulose synthase gene, the RSW1 gene is modified in planta, using
the chemical mutagen EMS. The mutant plants are identified following germination and the
modified RSW1 genes are isolated and characterised at the nucleotide sequence level. A
sequence comparison between the mutant gene sequences and the wild type sequence reveals
nucleotides which encode amino acids important to the normal catalytic activity of the
cellulose synthase enzyme, at least in Arabidopsis thaliana plants.

This approach thus generates further gene sequences of utility in the modification of cellulose
content and properties in plants.


EXAMPLE 18 
DISCUSSION 


Five pieces of evidence make a compelling case that the RSW1 gene encodes the
catalytic subunit of cellulose synthase:

1. The rsw1 mutation selectively inhibits cellulose synthesis and promotes accumulation
of a non-crystalline β-1,4-glucan;

2. The rsw1 mutation removes cellulose synthase complexes from the plasma membrane,
providing a plausible mechanism for reduced cellulose accumulation and placing the RSW1
product either in the complexes or interacting with them;

3. The D,D,D,QXXRW signature identifies the RSW1 gene product as a processive
glycosyl transferase enzyme (Saxena, 1995);

4. The wild type allele corrects the temperature sensitive phenotype of the rsw1 mutant;
and

5. Antisense expression of RSW1 in transgenic plants grown at 21 °C reproduces some
of the phenotype of rsw1 which is observed following growth at 31 °C.
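
The QXXRW element of the signature referred to in item 3 can be located in an amino acid
sequence with a simple pattern search. The following is a simplified sketch only: it finds
each Q-X-X-R-W occurrence and lists every aspartate (D) residue upstream of it, whereas the
conserved aspartates of the D,D,D,QXXRW signature lie in particular sequence contexts which
this sketch does not attempt to recognise. The toy sequence is hypothetical and is not taken
from any SEQ ID NO; the function name is likewise hypothetical.

```python
# Sketch: locate QXXRW motifs and list upstream aspartate positions (1-based).
# This is a simplification; it does not verify the sequence context of each D.
import re

QXXRW = re.compile(r"Q..RW")


def find_signature(protein: str) -> list:
    """Return (motif position, upstream D positions) for each QXXRW occurrence."""
    hits = []
    for match in QXXRW.finditer(protein):
        upstream_d = [index + 1
                      for index, residue in enumerate(protein[:match.start()])
                      if residue == "D"]
        hits.append((match.start() + 1, upstream_d))
    return hits


toy_sequence = "MADKLDGGDTTQVLRWAL"  # hypothetical example sequence
for motif_position, aspartates in find_signature(toy_sequence):
    print(f"QXXRW at residue {motif_position}; upstream D residues at {aspartates}")
```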

Consistent with the plasma membrane location expected for a catalytic subunit, the putative
122 kDa RSW1 product contains 8 predicted membrane-spanning regions. Six of these regions
cluster near the C-terminus (Figure 10), separated from the other two by a domain that is
probably cytoplasmic and has weak sequence similarities to prokaryotic glycosyl
transferases (Wong, 1990; Saxena, 1990; Matthyse, 1995; Sofia, 1994; Kutish, 1996).

RSW1 therefore qualifies as a member of the large family of Arabidopsis thaliana genes
whose members show weak similarities to bacterial cellulose synthase. RSW1 is the first
member of that family to be rigorously identified as an authentic cellulose synthase. Among
the diverse genes in A. thaliana, at least two genes show very strong sequence similarities to
the RSW1 gene and are most likely members of a highly conserved sub-family involved in
cellulose synthesis. The closely related sequences come from cosmid 12C4, a partial genomic
clone cross-hybridising with EST T20782 designated Ath-A, and from a full length cDNA
designated Ath-B.

Ath-A resembles RSW1 (SEQ ID NO:5) at its N-terminus whereas Ath-B starts 22 amino acid
residues downstream [Figure 8 and Figure 9(i), (ii) and (iii)]. Closely related sequences in
other angiosperms are the rice EST S0542 [Figure 9(i), (ii) and (iii)], which resembles the
polypeptides encoded by RSW1 and Ath-A and the cotton celA1 gene (Pear, 1996) at the
N-terminus.

The Arabidopsis thaliana, rice and cotton genes have regions of very high sequence similarity
interspersed with variable regions (Figures 9 and 10). Most of the highest conservation among
those gene products occurs in their central cytoplasmic domain where the weak similarities
to the bacterial cellulose synthase occur. The N-terminal region that precedes the first
membrane spanning region is probably also cytoplasmic but shows many amino acid
substitutions as well as sequences in RSW1 that have no counterpart in some of the other
genes, as already noted for celA. An exception to this is a region comprising 7 cysteine
residues with highly conserved spacings (Figure 10). This is reminiscent of regions suggested
to mediate protein-protein and protein-lipid interactions in diverse proteins including
transcriptional regulators and may account for the striking sequence similarity between this
region of RSW1 and two putative soybean bZIP transcription factors (Genbank SOYSTF1A
and 1B).

In conclusion, the chemical and ultrastructural changes seen in the cellulose-deficient mutant
combine with gene cloning and complementation of the mutant to provide strong evidence that
the RSW1 locus encodes the catalytic subunit of cellulose synthase. Accumulation of
non-crystalline β-1,4-glucan in the shoot of the rsw1 mutant suggests that properties affected
by the mutation are required for glucan chains to assemble into microfibrils. Whilst not being
bound by any theory or mode of action, a key property may be the aggregation of catalytic
subunits into plasma membrane rosettes. At the restrictive temperature, mutant synthase
complexes disassemble to monomers (or smaller oligomers) that are undetectable by freeze
etching. At least in the shoot, the monomers seem to remain biosynthetically active but their
β-1,4-glucan products fail to crystallise into microfibrils, probably because the chains are
growing from dispersed sites. Crystallisation into microfibrils, with all its consequences for
wall mechanics and morphogenesis, therefore may depend upon catalytic subunits remaining
aggregated as plasma membrane rosettes.

Those skilled in the art will appreciate that the invention described herein is susceptible to
variations and modifications other than those specifically described. It is to be understood that
the invention includes all such variations and modifications. The invention also includes all
of the steps, features, compositions and compounds referred to or indicated in this
specification, individually or collectively, and any and all combinations of any two or more
of said steps or features.



\C) TELEX : AA3178*? 
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE:
(B) CLONE: EST T20782

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1887

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGA GCT ATG AAG AGA GAG TAT GAA GAG TTT AAA GTG AGG ATA AAT GCT 4 8 

Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val Arg lie Asn Ala 
30 l S 10 15 

CTT GTT GCC AAA GCA CAG AAA ATC CCT GGA GAA GGC TGG ACA ATG CAG 96 

Leu Val Ala Lys Ala Gin Lys He Pro Gly Glu Gly Trp Thr Met Gin 

20 25 30 

35 

GAT GGT ACT CCC TGG CCT GGT AAC AAC ACT AGA GAT CAT CCT GGA ATG 14 4 

Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp His Pro Gly Met 
35 40 45 
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ATA CAG GTG TTC TTA GGC CAT AGT GGG GGT CTG GAT ACC GAT GGA AAT 192 
lie Gin Val Phe Leu Gly His Ser Gly Gly Leu Asp Thr Asp Gly Asn 
SO 55 60 

5 GAG CTG CCT AG A CTC ATC TAT GTT TCT CGT GAA AAG CGG CCT GGA TTT 24 0 

Glu Leu Pro Arg Leu lie Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe 
65 70 75 80 

CAA CAC CAC AAA AAG GCT GGA GCT ATG AAT GCA TCG ATC CGT GTA TCT 2 88 

10 Gin His His Lys Lys Ala Gly Ala Met Asn Ala Ser He Arg Val Ser 

85 90 95 

GCT GTT CTT ACC AAT GGA GCA TAT CTT TTG AAC GTG GAT TGT GAT CAT 336 
Ala Val Leu Thr Asn Gly Ala Tyr Leu Leu Asn Val Asp Cys Asp His 
15 100 105 110 

TAC TTT AAT AAC AGT AAG GCT ATT AAA GAA GCT ATG TGT TTC ATG ATG 3 84 

Tyr Phe Asn Asn Ser Lys Ala He Lys Glu Ala Met Cys Phe Met Met 
115 120 125 

20 

GAC CCG GCT ATT GGA AAG AAG TGC TGC TAT GTC CAG TTC CCT CAA CGT 43 2 

Asp Pro Ala He Gly Lys Lys Cys Cys Tyr Val Gin Phe Pro Gin Arg 

130 135 140 

25 TTT GAC GGT ATT GAT TTG CAC GAT CGA TAT GCC AAC AGG AAT ATA GTC 4 80 

Phe Asp Gly He Asp Leu His Asp Arg Tyr Ala Asn Arg Asn He Val 
145 150 155 160 

TTT TTC GAT ATT AAC ATG AAG GGG TTG GAT GGT ATC CAC GGT CCA GTA 52 8 

30 Phe Phe Asp He Asn Met Lys Gly Leu Asp Gly He His Gly Pro Val 

165 170 175 

TAT GTG GGT ACT GGT TGT TGT TTT AAT AGG CAG GCT CTA TAT GGG TAT 576 
Tyr Val Gly Thr Gly Cys Cys Phe Asn Arg Gin Ala Leu Tyr Gly Tyr 
35 180 185 190 

GAT CCT GTT TTG ACG GAA GAA GAT TTA GAA CCA AAT ATT ATT GTC AAG 6 24 

Asp Pro Val Leu Thr Glu Glu Asp Leu Glu Pro Asn He He Val Lys 
195 200 205 

AGC TGT TGC GGG TCA AGG AAG AAA GGT AAA AGT AGC AAG AAG TAT AAC 672
Ser Cys Cys Gly Ser Arg Lys Lys Gly Lys Ser Ser Lys Lys Tyr Asn
210 215 220

TAC GAA AAG AGG AGA GGC ATC AAC AGA AGT GAC TCC AAT GCT CCA CTT 720
Tyr Glu Lys Arg Arg Gly Ile Asn Arg Ser Asp Ser Asn Ala Pro Leu
225 230 235 240

TTC AAT ATG GAG GAC ATC GAT GAG GGT TTT GAA GGT TAT GAT GAT GAG 768
Phe Asn Met Glu Asp Ile Asp Glu Gly Phe Glu Gly Tyr Asp Asp Glu
245 250 255

AGG TCT ATT CTA ATG TCC CAG AGG AGT GTA GAG AAG CGT TTT GGT CAG 816
Arg Ser Ile Leu Met Ser Gln Arg Ser Val Glu Lys Arg Phe Gly Gln
260 265 270

TCG CCG GTA TTT ATT GCG GCA ACC TTC ATG GAA CAA GGC GGC ATT CCA 864
Ser Pro Val Phe Ile Ala Ala Thr Phe Met Glu Gln Gly Gly Ile Pro
275 280 285

CCA ACA ACC AAT CCC GCT ACT CTT CTG AAG GAG GCT ATT CAT GTT ATA 912
Pro Thr Thr Asn Pro Ala Thr Leu Leu Lys Glu Ala Ile His Val Ile
290 295 300

AGC TGT GGT TAC GAA GAC AAG ACT GAA TGG GGC AAA GAG ATT GGT TGG 960
Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp Gly Lys Glu Ile Gly Trp
305 310 315 320

ATC TAT GGT TCC GTG ACG GAA GAT ATT CTT ACT GGG TTC AAG ATG CAT 1008
Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His
325 330 335

GCC CGG GGT TGG ATA TCG ATC TAC TGC AAT CCT CCA CGC CCT GCG TTC 1056
Ala Arg Gly Trp Ile Ser Ile Tyr Cys Asn Pro Pro Arg Pro Ala Phe
340 345 350

AAG GGA TCT GCA CCA ATC AAT CTT TCT GAT CGT TTG AAC CAA GTT CTT 1104
Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu Asn Gln Val Leu
355 360 365
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CGA TGG GCT TTG GGA TCT ATC GAG ATT CTT CTT AGC AGA CAT TGT CCT 1152 
Arg Trp Ala Leu Gly Ser lie Glu He Leu Leu Ser Arg His Cys Pro 
370 375 380 

5 ATC TGG TAT GGT TAC CAT GGA AGG TTG AGA CTT TTG GAG AGG ATC GCT 1200 
lie Trp Tyr Gly Tyr His Gly Arg Leu Arg Leu Leu Glu Arg lie Ala 
385 390 395 400 

TAT ATC AAC ACC ATC GTC TAT CCT ATT ACA TCC ATC CCT CTT ATT GCG 124 8 

10 Tyr He Asn Thr He Val Tyr Pro He Thr Ser He Pro Leu He Ala 

405 410 415 

TAT TGT ATT CTT CCC GCT TTT TGT CTC ATC ACC GAC AGA TTC ATC ATA 12 96 

Tyr Cys He Leu Pro Ala Phe Cys Leu He Thr Asp Arg Phe He He 
15 420 425 430 

CCC GAG ATA AGC AAC TAC GCG AGT ATT TGG TTC ATT CTA CTC TTC ATC 13 44 

Pro Glu lie Ser Asn Tyr Ala Ser He Trp Phe He Leu Leu Phe He 

435 440 445 

20 

TCA ATT GCT GTG ACT GGA ATC CTG AAA CTG AAA TGG AAC GGT GTG AGC 13 92 

Ser lie Ala Val Thr Gly He Leu Lys Leu Lys Trp Asn Gly Val Ser 

450 455 460 

25 ATT GAG GAT TGG TGG AGG AAC AAC CAG TTC TGG GTC ATT GGT GGC ACA 144 0 

He Glu Asp Trp Trp Arg Asn Asn Gin Phe Trp Val He Gly Gly Thr 
465 470 475 480 

TCC ACC CAT CTT TTT GCT GTC TTC CAA GGT CTA CTT AAG GTT CTT GCT 14 88 

30 Ser Thr His Leu Phe Ala Val Phe Gin Gly Leu Leu Lys Val Leu Ala 

485 490 495 

GGT ATC AAC ACC AAC TTC ACC GTT ACA TCT AAA GCC ACA AAC AAA AAT 153 6 

Gly He Asn Thr Asn Phe Thr Val Thr Ser Lys Ala Thr Asn Lys Asn 
35 500 505 510 

GGG GAT TTT GCA AAA CTC TAC ATC TTC AAA TGG ACA GCT CTT CTC ATT 1584 
Gly Asp Phe Ala Lys Leu Tyr He Phe Lys Trp Thr Ala Leu Leu He 

515 520 525 

40 
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CCA CCA ACC ACC GTC CTA CTT GTG AAC CTC ATA GGC ATT GTG GCT GGT 16 3 2 

Pro Pro Thr Thr Val Leu Leu Val Asn Leu lie Gly lie Val Ala Gly 
530 535 540 

5 GTC TCT TAT GCT GTA AAC AGT GGC TAC CAG TCG TGG GGT CCG CTT TTC 1680 
Val Ser Tyr Ala Val Asn Ser Gly Tyr Gin Ser Trp Gly Pro Leu Phe 
545 550 555 560 

GGG AAG CTC TTC TTC GCC TTA TGG GTT ATT GCC CAT CTC TAC CCT TTC 172 8 

10 Gly Lys Leu Phe Phe Ala Leu Trp Val lie Ala His Leu Tyr Pro Phe 

565 570 575 

TTG AAA GGT CTG TTG GGA AGA CAA AAC CGA ACA CCA ACC ATC GTC ATT 17 76 

Leu Lys Gly Leu Leu Gly Arg Gin Asn Arg Thr Pro Thr He Val He 
1 5 580 585 590 

GTC TGG TCT GTT CTT CTC GCC TCC ATC TTC TCG TTG CTT TGG GTC AGG 18 24 

Val Trp Ser Val Leu Leu Ala Ser He Phe Ser Leu Leu Trp Val Arg 

595 600 605 

20 

ATC AAT CCC TTT GTG GAC GCC AAT CCC AAT GCC AAC AAC TTC AAT GGC 18 72 

He Asn Pro Phe Val Asp Ala Asn Pro Asn Ala Asn Asn Phe Asn Gly 

610 615 620 

AAA GGA GGT GTC TTT TAGACCCTAT TTATATACTT GTGTGTGCAT ATATCAAAAA 1927
Lys Gly Gly Val Phe
625

CGCGCAATGG GAATTCCAAA TCATCTAAAC CCATCAAACC CCAGTGAACC GGGCAGTTAA 1987

GGTGATTCCA TGTCCAAGAT TAGCTTTCTC CGAGTAGCCA GAGAAGGTGA AATTGTTCGT 2047

AACACTATTG TAATGATTTT CCAGTGGGGA AGAAGATGTG GACCCAAATG ATACATAGTC 2107

TACAAAAAGA ATTAGTTATA ACTTTCTTAT ATTTATTTTA TTTAAAGCTT GTTAGACTCA 2167

CACTTATGTA ATGTTGGAAC TTGTTGTCCT AAAAAGGGAT TGGAGTTTTC TTTTTATCTA 2227

AGAATCTGAA GTTTATATGC T
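
The CDS feature of SEQ ID NO:1 (positions 1..1887) and the stated length of SEQ ID NO:2 (629 amino acids) can be cross-checked by translating the nucleotide sequence. The following is a minimal sketch, assuming the standard genetic code. The helper names `translate` and `codon_count` are hypothetical, and only the first and last coding rows of SEQ ID NO:1 are used as input rather than the full sequence.

```python
# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring spaces; '*' marks a stop codon.

    Only unambiguous bases (A, C, G, T) are supported in this sketch.
    """
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_count(start, end):
    """Number of whole codons in a CDS given 1-based inclusive coordinates."""
    return (end - start + 1) // 3


first_row = "CGA GCT ATG AAG AGA GAG TAT GAA GAG TTT AAA GTG AGG ATA AAT GCT"
last_row = "AAA GGA GGT GTC TTT TAG"
print(translate(first_row))
print(translate(last_row))
print(codon_count(1, 1887))
```

The first row translates to RAMKREYEEFKVRINA and the last coding row to KGGVF followed by a stop, in agreement with the amino acid lines printed beneath the nucleotide sequence; a CDS spanning 1..1887 holds 629 codons.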
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(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 629 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala
1 5 10 15

Leu Val Ala Lys Ala Gln Lys Ile Pro Gly Glu Gly Trp Thr Met Gln
20 25 30

Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp His Pro Gly Met
35 40 45

Ile Gln Val Phe Leu Gly His Ser Gly Gly Leu Asp Thr Asp Gly Asn
50 55 60

Glu Leu Pro Arg Leu Ile Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe
65 70 75 80

Gln His His Lys Lys Ala Gly Ala Met Asn Ala Ser Ile Arg Val Ser
85 90 95

Ala Val Leu Thr Asn Gly Ala Tyr Leu Leu Asn Val Asp Cys Asp His
100 105 110

Tyr Phe Asn Asn Ser Lys Ala Ile Lys Glu Ala Met Cys Phe Met Met
115 120 125

Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gln Phe Pro Gln Arg
130 135 140

Phe Asp Gly Ile Asp Leu His Asp Arg Tyr Ala Asn Arg Asn Ile Val
145 150 155 160

Phe Phe Asp Ile Asn Met Lys Gly Leu Asp Gly Ile His Gly Pro Val
165 170 175

Tyr Val Gly Thr Gly Cys Cys Phe Asn Arg Gln Ala Leu Tyr Gly Tyr
180 185 190

Asp Pro Val Leu Thr Glu Glu Asp Leu Glu Pro Asn Ile Ile Val Lys
195 200 205

Ser Cys Cys Gly Ser Arg Lys Lys Gly Lys Ser Ser Lys Lys Tyr Asn
210 215 220

Tyr Glu Lys Arg Arg Gly Ile Asn Arg Ser Asp Ser Asn Ala Pro Leu
225 230 235 240

Phe Asn Met Glu Asp Ile Asp Glu Gly Phe Glu Gly Tyr Asp Asp Glu
245 250 255

Arg Ser Ile Leu Met Ser Gln Arg Ser Val Glu Lys Arg Phe Gly Gln
260 265 270

Ser Pro Val Phe Ile Ala Ala Thr Phe Met Glu Gln Gly Gly Ile Pro
275 280 285

Pro Thr Thr Asn Pro Ala Thr Leu Leu Lys Glu Ala Ile His Val Ile
290 295 300

Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp Gly Lys Glu Ile Gly Trp
305 310 315 320

Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His
325 330 335

Ala Arg Gly Trp Ile Ser Ile Tyr Cys Asn Pro Pro Arg Pro Ala Phe
340 345 350

Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu Asn Gln Val Leu
355 360 365

Arg Trp Ala Leu Gly Ser Ile Glu Ile Leu Leu Ser Arg His Cys Pro
370 375 380

Ile Trp Tyr Gly Tyr His Gly Arg Leu Arg Leu Leu Glu Arg Ile Ala
385 390 395 400

Tyr Ile Asn Thr Ile Val Tyr Pro Ile Thr Ser Ile Pro Leu Ile Ala
405 410 415

Tyr Cys Ile Leu Pro Ala Phe Cys Leu Ile Thr Asp Arg Phe Ile Ile
420 425 430

Pro Glu Ile Ser Asn Tyr Ala Ser Ile Trp Phe Ile Leu Leu Phe Ile
435 440 445

Ser Ile Ala Val Thr Gly Ile Leu Lys Leu Lys Trp Asn Gly Val Ser
450 455 460

Ile Glu Asp Trp Trp Arg Asn Asn Gln Phe Trp Val Ile Gly Gly Thr
465 470 475 480

Ser Thr His Leu Phe Ala Val Phe Gln Gly Leu Leu Lys Val Leu Ala
485 490 495

Gly Ile Asn Thr Asn Phe Thr Val Thr Ser Lys Ala Thr Asn Lys Asn
500 505 510

Gly Asp Phe Ala Lys Leu Tyr Ile Phe Lys Trp Thr Ala Leu Leu Ile
515 520 525

Pro Pro Thr Thr Val Leu Leu Val Asn Leu Ile Gly Ile Val Ala Gly
530 535 540

Val Ser Tyr Ala Val Asn Ser Gly Tyr Gln Ser Trp Gly Pro Leu Phe
545 550 555 560

Gly Lys Leu Phe Phe Ala Leu Trp Val Ile Ala His Leu Tyr Pro Phe
565 570 575

Leu Lys Gly Leu Leu Gly Arg Gln Asn Arg Thr Pro Thr Ile Val Ile
580 585 590

Val Trp Ser Val Leu Leu Ala Ser Ile Phe Ser Leu Leu Trp Val Arg
595 600 605

Ile Asn Pro Phe Val Asp Ala Asn Pro Asn Ala Asn Asn Phe Asn Gly
610 615 620

Lys Gly Gly Val Phe
625
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(2) INFORMATION FOR SEQ ID NO : 3 : 

<i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8411 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana
(B) STRAIN: Columbia (wild-type)

(vii) IMMEDIATE SOURCE:
(B) CLONE: 23H12 RSW1 GENE



(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2296..2376

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2904..3099

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3198..3370

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3594..3708

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3824..4013

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4181..4447

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4783..5128

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5207..5344

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5426..5551

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5703..5915

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 6022..6286

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 6374..6570

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 6655..7005

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 7088..8032

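
The exon coordinates in the feature table for SEQ ID NO:3 can be turned into exon and intron lengths with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive, as is conventional for this listing format. The names `EXONS`, `exon_lengths` and `intron_lengths` are hypothetical, and the summed exon length is not necessarily the coding length, since the annotated exons may include untranslated sequence.

```python
# Exon coordinates (1-based, inclusive) as listed in the feature table for SEQ ID NO:3.
EXONS = [
    (2296, 2376), (2904, 3099), (3198, 3370), (3594, 3708), (3824, 4013),
    (4181, 4447), (4783, 5128), (5207, 5344), (5426, 5551), (5703, 5915),
    (6022, 6286), (6374, 6570), (6655, 7005), (7088, 8032),
]


def exon_lengths(exons):
    """Length of each exon; both end coordinates are inclusive."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length of the gap between each pair of consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


lengths = exon_lengths(EXONS)
introns = intron_lengths(EXONS)
print(len(EXONS), "exons, total exon length", sum(lengths))
print("exon lengths:", lengths)
print("intron lengths:", introns)
```

With the coordinates above, the first exon is 81 nucleotides long and the first intron 527.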
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ixi) SEQUENCE DESCRIPTION : SEQ ID NO : 3 : 

TTAGAAGAAG CCTGAGCCGG AGTCCTATTC AATTATCTAG AAGAAGTCTG AG CCGG AGTC 6 0 

5 CCACTCGATT GTCTAGGAGA AGCCTAAGCC GGAGTCCCAT TCGATCACCT AGGAAGAGTG 120 

TG AG CAGGAG TCCAGTCCGA TCATCTAGGA AGAGTGTGAG CAGAAGTCCG GTTCGTTCAT 180 

CCAGGAGACG TATCAGCAGG AGTCCAGTCC GATCATCTAG GAAGAGTGTG AG CAGGAG T C 24 0 

10 

CTATTCGATT GTCCAGAAGA AG TAT C AG C A GG AG T C C TAT TCGATTGTCC AGGAGAAGTA 3 00 

TCAG CAGGAG TCCTGTTAGA GGAAGAAGAA GAATTAGCAG AAGTCCAGTT CCGGCAAGGA 360 

15 GAAGGAGTGT GCGGCCTAGA TCTCCTCCTC CTGACCGCAG AAGAAGTTTG TCAAGAAGTG 42 0 

CTTCTCCTAA TGGGCGCATA AGGAGAGGGA GAGGATTTAG CCAAAGATTC TCATACGCCC 4 80 

GTCGATACAG AACTAGTCCA TCTCCTGATC GATCTCCTTA TCGCTTTAGT GATAGGAGTG 54 0 

20 

ACCGTGACAG GTGAATAGCC CACACATAAT ATAACTCCCC CTTTCTGTTA CACACTCTCG 6 00 

TACTGAACCG TCTTTTTTAT AACGTCTTCT C TG TAG ATT T AGAAGTCGCA GAAGGTTCTC 6 60 

25 GCCAAGTCGG TTCAGAAGCC CACTAAGAGG AAGAACACCT CCAAGGTACT TATCCTCTTT 720 

AGTACATTGT TTCAGCTGAT TCTTTACATC TAAAAGTTTC ATGAATATGG AACTAAAATT 780 

GGTGATCCAA AAGAATTATT CTTGATTTCA CAACTCGAAA GTATGCTCAG GTATAGAAGA 84 0 

30 

AGAAGCCGCT CAGTATCGCC TGGTCTCTGT TATCGCAACC GGCGGTACAG CCGCAGCCCT 900 

ATCCGTAGCC GATCTCCACC TTACAGAAAG AGAAGGTCAC CATCCGCTAG CCACAGCCTG 960 

35 AGTCCATCGA GGTCAAGATC AAGATCAAAG TCATAT TCAA AATCTCCCAT TGGGACGGGG 1020 

AAAG CAAGAT CAGTGTCAAG ATCACCATCC AAGGCAAGGT CTCCATCGAA GTCGGATTCG 1080 

ACATCCTCGG ATAATAGCCC AG GTGGG AAA AAGGGATTAG TAGCCTATGA TTAATGAATA 114 0 

ATGATTACCC TTAAGTTAAG TGTTTGTTCT TTTTACTGAG AAGAGATGGT AAAGAGAGTA 1200

AGTAGTTTAC TTCTGTAAAA CATAAGCATT GTCTTTTGCG TATGTTTGTT TGATTATGCT 1260

CCAAGATTGT TAAAAATTTC TGTTGATGTT TGCCGACATT TTTTCTTTGT TGCCATTTGC 1320

CGACAAATGT TAACTTCCAT TATTCGTTGC GGAGTTGGTT TTGGTCCAAT AATTAAACTT 1380

TCATAAAATT AAGCATAACT AAATGTGACG TTTGTCACCA AACTTTAGAA CAACGACATC 1440

GTAATTTATT TATTTGGATA ATCAATATAA TTTACGATTT CTTCCTACAT ATATATCATA 1500

TCACTATACC ACCGTCATTA TCACTATCAC TAAATATAAA AATGTTAAAA TGATTTCTTA 1560

ATGGAATTTT TTTTGTTAAA AGTTTATTGA CACAAAAAAT GAATTAAAAC TCAGAAATCT 1620

GTATACTGAA TTAAAACTTG TAAATATAAC AACAAAATGG GATTAAAAAA AGAAGTGGCA 1680

TCCATTTAAA AATTATTTGC GAATTCGCCC GTAACTTCTT AAGCTAACAA TTAGAACCTA 1740

ATCAACACTA GTTATTTTGA GTCCACCGAC AGGTGATAGC AAATAAAAAA GAACAGGCTG 1800

GTACCAGAGC CAACAACAAC GTGGCTTCTT CTTTTTTTTT TTTAATATAA TCAAACAATC 1860

ATACTTTGTC CTATCTCTTT CTTGCAATAA GATTTTGCCA CGTCACATAC TAAGAAGCTG 1920

GCGCGTCTAG TGGGGAAGCC AGAACGGCTC ACTTTAAAAA GTAGAGAGAT GATAACTTGA 1980

GCCGAATAGA GCCGAGCTGA GCTAAAACGG TGGGAGAGGA AGAGGCTACT ACTACCGTCA 2040

CCATCTCCGG TAAAATAATG TACTTGTCAT TTAAAAATTA AGAAAAAACA CATCACTCTG 2100

CGATAAAATA GGCAAAAGCA GATTTGAAGA AGAAGCAGCT TGAGATATCA AATAGAGAGA 2160

GAGAGTGACA GAGGAGTGTG TGAACATCCT TTTTTAGTAG ATTTGGGTTT TCGAGATGCC 2220

GTATTGAATC GGCTACGAAT TTCCCAATTT TGAATTTTGT GAATCTCTCT CTTTCTCTGT 2280

GTGTCGGTGG CTGCGATGGA GGCCAGTGCC GGCTTGGTTG CTGGATCCTA CCGGAGAAAC 2340
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GAGCTCGTTC GGATCCGACA TGAATCTGAT GGCGGGGTCT GTTCATCTTC CCTTTTTCCC 24 0 0 

ATTTTTTTGT TATTGTTTTT CGTTCTTACA ATTTTTGATG TGTAGATCTC ATCTAGATTT 24 60 

5 CTCTGTTTCT AAATCTCGTC TCTT TTGG AT CCATAATTGG ATCATTGAAA CTCAGATTTC 252 0 

GCTTCCTTTG ACTGTGTAGT T AG TT AG TG T CAGTTGATCA AGTAAGTGTC TGAAAATGGA 2 5B0 

AACTTTTCTG CTCCAATTCT TCAAATTGTT GTGATCTATA TCAATTAATG CCGCATCTGT 264 0 

10 

TTTCTTAAAA TCTCTTATGG AAAGTGTCGG TGGATTTCAG TTCGTTAACT TTTTTAAGCT 2 700 

AAAATCTTTG ACTCTTAAAG TTTAGCTTTA CTTATTGAGA TTTAGCTCAA CTAGATCTCG 2 76 0 

15 TTAG TTCCCG CCATGGGATA CAGACTGTGA CTCGCCTTAA TTCAGATCTG CATTGATTGT 2 82 0 

TTTGATTTAG ATCCTTGCTC ATCTCTTTCT GTAGTTTCTA ATACTCAATG ACTAACAATG 28 8 0 

ATGCAATGTT GGTCAAAGTG CAGACCAAAC CTTTGAAGAA TATGAATGGC CAGATATGTC 2 94 0 

20 

AGATCTGTGG TGATGATGTT GGACTCGCTG AAACTGGAGA TGTCTTTGTC GCGTGTAATG 3 0 00 

AATGTG C CTT CCCTGTGTGT CGGCCTTGCT ATGAGTACGA GAGGAAAGAT GGAACTCAGT 3 06 0 

25 GTTGCCCTCA ATGCAAGACT AGATTCAGAC GACACAGGGG TCAGTTGTCT TTTTCTTTTT 312 0 

GTTGGCAATT GCTATATATG GATTTTCTCT TTTTGTTTCT TTGCTGTTGT GTTGAACAAT 3180 

TTTTTGGAAT TTTCCAGGGA GTCCTCGTGT TGAAGGAGAT GAAGATGAGG ATGATGTTGA 324 0 

30 

TGATATCGAG AATGAGTTCA ATTACGCCCA GGGAGCTAAC AAGGCGAGAC ACCAACGCCA 3 3 00 

TGGCGAAGAG TTTTCTTCTT CCTCTAGACA TGAATCTCAA CCAATTCCTC TTCTCACCCA 3 360 

35 TGGCCATACG GTAGGGACCT ACATTTTCCC TTTAGACTCT AGAGTGATTT GTATTACTCA 34 2 0 

ATAAATCCCT AGAGTGGTCA TTTATTACTT ACTATTCACG TTAATGTTAT ATGTGAACAA 34 80 

ATCTTAACAG AATTTTTTTC TGATAGTACA TGGTCATCCA AATTAAGAAA TAATAATAGA 3 54 0 

40 
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TGTTGTTAGT TGTGTCTGTT TT CAAT AG AT TCATGACCTT TTTCTATACA CAGGTTTCTG 36 00 

GAGAGATTCG CACGCCTGAT ACACAATCTG TGCGAACTAC ATCAGGTCCT TTGGGTCCTT 3 66 0 

5 CTGACAGGAA TGCTATTTCA TCTCCATATA TTGATCCACG GCAACCTGGT ATTCATATGT 3 720 

TTTTCCCTTG TGCACGTGGT CTTTGTTAAA TGTGATTCCT ATTCATTTTT ACAA CAT ATA 3 78 0 

TATTTTGTGT ACCGTAACTG ATAGCTCCCG CTAAAAATTG CAGTCCCTGT AAGAATCGTG 3 84 0 

10 

GACCCGTCAA AAGACTTGAA CTCTTATGGG CTTGGTAATG TTGACTGGAA AGAAAGAGTT 3 900 

GAAGGCTGGA AGCTGAAGCA GGAGAAAAAT ATGTTACAGA TGACTGGTAA ATACCATGAA 3 960 

15 GGGAAAGGAG GAGAAATTGA AGGGACTGGT TCCAATGGCG AAGAACTCCA AATGTAAGTG 4 02 0 

GAAATACTAG ACCAATATCT TTATTGTCCA ACTCAAACAG CTCTTGGCCG TGATGCTAAT 4 080 

AAC CACTCTT GGTTTCTTAT TATGTATTGA TAGACATAAT TAAGTATCTG CTTTGTTACA 414 0 

20 

TTTGTTTCCT TCCACTCAAT TATGGTTCTC GTACTTACAG GGCTGATGAT ACACGTCTTC 4 200 

CTATGAGTCG TGTGGTGCCT ATCCCATCTT CTCGCCTAAC CCCTTATCGG GTTGTGATTA 4 260 

25 TTCTCCGGCT TATCATCTTG TGTTTCTTCT TGCAATATCG TACAACTCAC CCTGTGAAAA 43 20 

ATGCATATCC TTTGTGGTTG ACCTCGGTTA TCTGTGAGAT CTGGTTTGCA TTTTCTTGGC 4 3 80 

TTCTTGATCA GTTTCCCAAA TGGTACCCCA TTAACAGGGA GACTTATCTT GACCGTCTCG 44 4 0 

30 

CTATAAGGTT GGT CTTTAAG TTTATACATC CCCTACTCTC ATCTCTCTTT TATGTATTAA 4S00 

CTTGATATCT TCTATCACAG TTTTCGATAG TTGACTTTTT CCCCCTGTAA ATTTAATTTA 4 5 60 

35 AATTTAGACA ATGGTGCATC TGAATTTTGA TTATGATATA TCTTAAGAAG ATTATGATTG 4 6 20 

TAAATCTTGA AATTTAGTAG AAAACCATCT GCAATCTACT GACCATGTGA AGTTTCCGAC 4 6 80 

TAGACTATGA TAGAAGCATG CCAAGTGGAG TGTTTATTAA GATAGAGCTT AG CTATT ATA 4 74 0 

CTGATTTTAT ATGTGTTTTG ATTTTTTGGT TTCTTATTGT AGATATGATC GAGACGGTGA 4800

ACCATCACAG CTCGTTCCTG TTGATGTGTT TGTTAGTACA GTGGACCCAT TGAAAGAGCC 4860

TCCCCTTGTT ACAGCAAACA CAGTTCTCTC GATTCTTTCT GTGGACTACC CGGTAGATAA 4920

AGTAGCCTGT TATGTTTCAG ATGATGGTTC AGCTATGCTT ACCTTTGAAT CCCTTTCTGA 4980

AACCGCTGAG TTTGCAAAGA AATGGGTACC ATTTTGCAAG AAATTCAACA TTGAACCTAG 5040

GGCCCCTGAA TTCTATTTTG CCCAGAAGAT AGATTACTTG AAGGACAAGA TCCAACCGTC 5100

TTTTGTTAAA GAGCGACGAG CTATGAAGGT CATTTGAAAA GTCCACCTGC TTCTCATCCA 5160

TACGGCAAAG AGATTGACTG ACTTTTTCTT TGGTTTGTAT TGACAGAGAG AGTATGAAGA 5220

GTTTAAAGTG AGGATAAATG CTCTTGTTGC CAAAGCACAG AAAATCCCTG AAGAAGGCTG 5280

GACAATGCAG GATGGTACTC CCTGGCCTGG TAACAACACT AGAGATCATC CTGGAATGAT 5340

ACAGGTACAG TGTGGCAATC CCTTGATTGT GACAGAGAGG ATAACGTAAA GGAAACATGT 5400

TTACATCGTT TTGTTTCAAT TTCAGGTGTT CTTAGGCCAT AGTGGGGGTC TGGATACCGA 5460

TGGAAATGAG CTGCCTAGAC TCATCTATGT TTCTCGTGAA AAGCGGCCTG GATTTCAACA 5520

CCACAAAAAG GCTGGAGCTA TGAATGCATT GGTTTGTTAA CTTTCAGAAT CCTATTGTGT 5580

CCTCTATTTT ATTCTCTTGT TCACTGCCTA AGAAACGTTC TTCTTGTGTA GCCGTTGCTT 5640

CACATTCTTT TTTTTCTAGG CTATGTGTTC TCTCCTAATT TAGTATCTCT TTACTTTGAC 5700

AGATCCGTGT ATCTGCTGTT CTTACCAATG GAGCATATCT TTTGAACGTG GATTGTGATC 5760

ATTACTTTAA TAACAGTAAG GCTATTAAAG AAGCTATGTG TTTCATGATG GACCCGGCTA 5820

TTGGAAAGAA GTGCTGCTAT GTCCAGTTCC CTCAACGTTT TGACGGTATT GATTTGCACG 5880

ATCGATATGC CAACAGGAAT ATAGTCTTTT TCGATGTGAG TATCACTTCC CCATTGTCTT 5940


TTGTTTCTCT TTTGTTCATA TTTTGGTTGG ATTTACTCGT TTCTGCTATG GCCTGACTTG 6000

GATATTTGTT CTCTTGGGCA GATTAACATG AAGGGGTTGG ATGGTATCCA GGGTCCAGTA 6060

TATGTGGGTA CTGGTTGTTG TTTTAATAGG CAGGCTCTAT ATGGGTATGA TCCTGTTTTG 6120

ACGGAAGAAG ATTTAGAACC AAATATTATT GTCAAGAGCT GTTGCGGGTC AAGGAAGAAA 6180

GGTAAAAGTA GCAAGAAGTA TAACTACGAA AAGAGGAGAG GCATCAACAG AAGTGACTCC 6240

AATGCTCCAC TTTTCAATAT GGAGGACATC GATGAGGGTT TTGAAGGTTT GATTGAGCTG 6300

ATTGTGTAAT AACATCACTT CTTTATGTAA TGATTTATGT GATGGTGAAA TCTTACAATC 6360

CTTGTTTATG CAGGTTATGA TGATGAGAGG TCTATTCTAA TGTCCCAGAG GAGTGTAGAG 6420

AAGCGTTTTG GTCAGTCGCC GGTATTTATT GCGGCAACCT TCATGGAACA AGGCGGCATT 6480

CCACCAACAA CCAATCCCGC TACTCTTCTG AAGGAGGCTA TTCATGTTAT AAGCTGTGGT 6540

TACGAAGACA AGACTGAATG GGGCAAAGAG GTCAGTTTTC AAATGCAGCT ACAGAATCTT 6600

CTTATGTTCT CTTTCTTACC TGTTTGATGA CATCTTATTT GGCACTTTTG TTAGATTGGT 6660

TGGATCTATG GTTCCGTGAC GGAAGATATT CTTACTGGGT TCAAGATGCA TGCCCGGGGT 6720

TGGATATCGA TCTACTGCAA TCCTCCACGC CCTGCGTTCA AGGGATCTGC ACCAATCAAT 6780

CTTTCTGATC GTTTGAACCA AGTTCTTCGA TGGGCTTTGG GATCTATCGA GATTCTTCTT 6840

AGCAGACATT GTCCTATCTG GTATGGTTAC CATGGAAGGT TGAGACTTTT GGAGAGGATC 6900

GCTTATATCA ACACCATCGT CTATCCTATT ACATCCATCC CTCTTATTGC GTATTGTATT 6960

CTTCCCGCTT TTTGTCTCAT CACCGACAGA TTCATCATAC CCGAGGTTTG TAAAACTGAC 7020

CACACTGCTA TTTACTATTT GAATCCCATT TTGTGAATGC ATTTTTTTGT CATCATCATT 7080

GTTGCAGATA AGCAACTACG CGAGTATTTG GTTCATTCTA CTCTTCATCT CAATTGCTGT 7140
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GACTGGAATC CTGGAGCTGA GATGGAGCGG TGTGAGCATT GAGGATTGGT GGAGGAACGA 7 2 00 

GCAGTTCTGG GTCATTGGTG GCACATCCGC CCATCTTTTT GCTGTCTTCC AAGGTCTACT 7260 

5 TAAGGTTCTT GCTGGTATCG ACACCAACTT CACCGTTACA TCTAAAGCCA CAGACGAAGA 73 2 0 

TGGGGATTTT GCAGAACTCT ACATCTTCAA ATGGACAGCT CTTCTCATTC CACCAACCAC 73 80 

CGTCCTACTT GTGAACCTCA TAGGCATTGT GGCTGGTGTC TCTTATGCTG TAAACAGTGG 74 4 0 

10 

CTACCAGTCG TGGGGTCCGC TTTTCGGGAA GCTCTTCTTC G CCTTATGGG TTATTGCCCA 7 50 0 

TCTCTACCCT TTCTTGAAAG GTCTGTTGGG AAGACAAAAC CGAACACCAA CCATCGTCAT 7 56 0 

15 TGTCTGGTCT GTTCTTCTCG CCTCCATCTT CTCGTTGCTT TGGGTCAGGA TCAATCCCTT 76 2 0 

TGTGGACGCC AATCCCAATG CCAACAACTT CAATGGCAAA GGAGGTGTCT TTTAGACCCT 76 8 0 

ATTTATATAC TTGTGTGTGC ATATATCAAA AACGCGCAAT GGGAATTCCA AATCATCTAA 7 74 0 

20 

ACCCATCAAA CCCCAGTGAA CCGGGCAGTT AAGGTGATT C CATGTCCAAG ATTAGCTTTC 780 0 

TCCGAGTAGC CAGAGAAGGT GAAATTGTTC GTAACACTAT TGTAATGATT TTCCAGTGGG 7 86 0 

25 GAAGAAGATG TGGACCCAAA TGATACATAG TCTACAAAAA GAATTTGTTA TTCTTTCTTA 7 92 0 

TATTTATTTT ATTTAAAGCT TGTTAGACTC ACACTTATGT AATGTTGGAA CTTGTTGTCC 7 98 0 

T AAAAAGG G A TTGGAGTTTT CTTTTTATCT AAGAATCTGA AG TTT ATATG CTAAGCTTTT 804 0 

30 

CACTTTACTA CAAAAAGTTT ATGGATATGA TGGTGTACGT CAATTGTTGG TGCAAGTGTT 8100 

GATGTCTTCG GGTGAACTCG CCCTCTTGTT TTGTCTCACC CATCAGTACA AATAGAATGA 8160 

35 CATTTATTTT TTTGAACTTT TAACGAAATC TTTGTCATTA TGGGACTTGA TCAGTAAAGT 8220 

TACATATTTG AAGAGATATT GTGTAAACTC TTATTTGAAT CAGAATCAGA TCAATCAAAA 82 8 0 

ATTGAAAACG TAAAGTTCAA ACAAAAAGGT AGAGTGAATC TTTTAATCCC CCCTCAATAC 834 0 

40 
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TAATTTGTGA AATCTCAAGT GGTGTAAAAT GAACCCAATT AG T A T C C AC A ATGTGTTTCT 84 00 



CTGATCAATC C 8411 



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5009 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana
(B) STRAIN: Columbia

(vii) IMMEDIATE SOURCE:
(B) CLONE: 12C4

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 863..943

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1454..1840

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1923..2025

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2122..2311

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2421..2687

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2776..3121

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3220..3357

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3507..3623

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 3723..3935

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4027..4297

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4380..4576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AAGGAATAAT AAGATAGGGG TTTAATGGGA GACAATCAAT CTTCAGGGGT TTTCTGGAAN 60

AACGGCGGGG TAAAAAACAA GACATCAATC GGACCCGATC ACGAGGACCC GGATCCGNAT 120

CGATAAACAG NGTAGCTTTC AATACCCCAT TTTCCCAGAA ACACCTCTCA AAAATTTTTT 180

CAAGAACTNG TATAAATATC TCAGTTTCGT TCACGCAGGT CTTTNTTATT TTGGNAANTC 240

TNTNTTCATN GTTCACCAAC TCCCTCTTGA AGGTGGGACA GAGTCCAGCT CCACCACCAC 300

CATAGCCATC GCGTCGTTTT CTCCGGGACC CACTTATTTC GTGACGTTTC TCTCTTTGTA 360

TATACATACA ATTGTTTTCA GTCTCAATTT GCTGTCCACA TTTTAACACA ACTCTATCTC 420

AGGGGTGGTG TCTGAATCTC GTCTCTCTCA TTCCTATTTA TCCCAATCTA ATCTATCACA 480

AACCCTTCCA CATTGCTTTT GTCAGTCTGT AAAATTCTCT TTGAATCAGT GAATCACTCA 540

CTTAAATCCA AAACAGTTTT TTTTTCTTTC TTTCTTTATT TGCTTGTTGT GGAATCAATA 600

GCTGTCTCCG GGAAAATTCG TTTTTTTTCT CCTTCGGGAT CTCTTTTTTT TTTTTTTTGG 660

TTTTATTTAA TAATTATCCC CGAGCCAACA TTTATTGTCG ATTCGGTTTA TTTCGTCTCC 720

TTCGTCTTCC ACTCTTACTA GTGCATGCTC TGAATCTGTA TGTAATGGGA GTTCAACAGT 780

CTGGATCCAT TATCCTAGCC GGGTCGGGTC AAGGTCTTTG AGTAAGAGAG ACAATTCGTT 840

TTGATTCGGT GTAGAAGACA TCATGAATAC TGGTGGTCGG CTCATTGCTG GCTCTCACAA 900

CAGAAACGAA TTCGTTCTCA TTAACGCCGA TGAGAGTGCC AGAGTAAGAA TAACTTTTGT 960

ANGAATTTGT GACGGAAAAA AGTTTAATTT TTTCTCTTTC TTGGGGATCT AGATTATGAG 1020

AATCTAGATG GAATATTTTG ATCTGAAATT GGAAGTTTCT AGGGAGTAAT GCCGCAACCC 1080

ACATGTTCTG TTTTTTCTTT TTTCTTTTCT TCAAGTAGTG TTGCATGATT CATACGTGTC 1140

GGCAGAGATG TCCTGAGAAC CGAATTCAAT GTTGTAGCAG TAGCAATAAG TTCAAAGAAA 1200

GTCCATTTTT TTATATTACT AATTCTGTTC TTGGTTTATT TGAGCTGGTC TTTATTGCAT 1260

TTCACCTGGA TTCAGATACT AATAACTGTC TCAATTATGT AAAAATGACA ACTTTATGAA 1320

ATTCAGTTTC ACAATTATGT AATTCATAAT CGATGAATGT TTTTCTTGAG TCTTTATCAT 1380

CTTTAGGATT TGATTAAGAT GCAATTTGAT GAAAATACTA AAAAGACTCA TGTGTTCTCA 1440

TTTCTCTATG TAGATACGAT CAGTACAAGA ACTGAGTGGG CAAACATGTC AAATCTGTGG 1500

AGATGAAATC GAATTAACGG TTAGCAGTGA GCTCTTTGTT GCTTGCAACG AATGCGCATT 1560

CCCGGTTTGT AGACCATGCT ATGAGTATGA ACGTAGAGAA GGAAATCAAG CTTGTCCTCA 1620

GTGCAAAACT CGATACAAAA GGATTAAAGG TAGTCCACGG GTTGATGGAG ATGATGAAGA 1680

AGAAGAAGAC ATTGATGATC TTGAGTATGA GTTTGATCAT GGGATGGACC CTGAACATGC 1740

CGCTGAAGCC GCACTCTCTT CACGCCTTAA CACCGGTCGT GGTGGATTGG ATTCAGCTCC 1800

ACCTGGCTCT CAGATTCCTC TTTTGACTTA TTGTGATGAA GTGAGGAATC CAAATTGTTT 1860

GTTTTCTCTG ACAATGTTGT TGCTTAGATG ATTCTTTTTC TTATTAGTCT ATGTGTTTTC 1920

AGGATGCTGA TATGTATTCT GATCGTCATG CTCTTATCGT GCCTCCTTCA ACGGGATATG 1980

GGAATCGCGT CTATCCTGCA CCGTTTACAG ATTCTTCTGC ACCTCGTATG TGTTTACTTT 2040

TATGATTCCT ACAATTTTTC TTCTTATATG ATTTGGTCAC CTTCTAATGA GTTATGAAAT 2100

GGTTTTGTTT GTTGTTTTCA GCACAGGCGA GATCAATGGT TCCTCAGAAA GATATTGCGG 2160

AATATGGTTA TGGAAGTGTT GCTTGGAAGG ACCGTATGGA AGTTTGGAAG AGACGACAAG 2220

GCGAAAAGCT TCAAGTCATT AAGCATGAAG GAGGAAACAA TGGTCGAGGT TCCAATGATG 2280

ACGACGAACT AGATGATCCT GACATGCCTA TGTAAGTTGT TAAAATCTAA CAAAAGTTCA 2340

GATGAAATGA TGCTCTGAAA TTTTGTGTTC AATGGNTTTG TTTTCTTATT GTTGTTTAAA 2400

CATTTTTCGT GCTAATTCAG GATGGATGAA GGAAGACAAC CTCTCTCAAG AAAGCTACCT 2460

ATTCGTTCAA GCAGAATAAA TCCTTACAGG ATGTTAATTC TGTGTCGCCT CGCGATTCTT 2520

GGTCTTTTCT TTCATTATAG AATTCTCCAT CCAGTCAATG ATGCATATGG ATTATGGTTA 2580
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ACGTCAGTTA TATGCGAGAT ATGGTTTGCA GTGTCTTGGA TTCTTGATCA ATTCCCCAAA 2 64 0 

TGGTATCCTA TAGAACGTGA AACATACCTC GATAGACTCT CTCTCAGGTA ACATAAACCC 2700

TGAAAAGTTC TTGTCTGCAA ATATTCATTT TTTACATTCC CAAAAATTTT TGAAACTCTA 2760

TTTTTCTTAC ATAAGGTACG AGAAGGAAGG AAAACCGTCA GGATTAGCAC CTGTTGATGT 2820

TTTTGTTAGT ACAGTGGATC CGTTGAAAGA GCCACCCTTG ATTACAGCAA ACACAGTTCT 2880

TTCCATTCTA GCAGTTGATT ATCCTGTGGA TAAGGTTGCG TGTTATGTAT CAAACAATGG 2940

TGCAGCTATG CTTACATTTG AAGCTCTCTC TGATACAGCT GAGTTTGCTA GAAAATGGGT 3000

TCCTTTTTGT AAGAAGTTTA ATATCGAGCC ACGAGCTCCT GAGTGGTATT TTTCTCAGAA 3060

GATGGATTAC CTGAAGAACA AAGTTCATCC TGCTTTTGTC AGGGAACGTC GTGCTATGAA 3120

GGTTTTCTTT GCTGCTTTTT CTCTTTCTGA GTATATCCTA TCATAAAAGT GTTGTTTCAA 3180

GAATCTGATT TACGTTTTTT GCTTGTTTGT TTGTTGCAGA GAGATTATGA GGAGTTTAAA 3240

GTGAAGATAA ATGCACTGGT TGCTACTGCA CAGAAAGTGC CTGAGGAAGG TTGGACTATG 3300

CAAGATGGAA CTCCTTGGCC TGGAAACAAC GTCCGTGACC ATCCTGGAAT GATTCAGGTA 3360

ATGATGAGTT TGATTGAATA GGCAAAAAAA AAGCGGTTTT TGTCCTCTTC ACTTTGTTTC 3420

CCTGGATCTG TTAAATTGGA ATGAGCACTC TACTTCTCAA TATATCTTCA GACCGAAGCC 3480

TTTTTAAGAG ATTTTGTAAA TGACAGGTGT TCTTGGGTCA TAGTGGAGTT CGTGATACGG 3540

ATGGTAATGA GTTACCACGT CTAGTGTATG TTTCTCGTGA GAAGCGGCCT GGATTTGATC 3600

ACCACAAGAA AGCTGGAGCT ATGAATTCCT TGGTAAGTAT AATGTGTTTC TTTATTTATG 3660

AATCTCTCTT TTCGGAGCCC TGACTTCTCA TAAACTAAAA CTCATCTTAC TTCTTCTTGA 3720

AGATCCGAGT CTCTGCTGTT CTATCAAACG CTCCTTACCT TCTTAATGTC GATTGTGATC 3780

ACTACATCAA CAACAGCAAA GCAATTAGAG AATCTATGTG TTTCATGATG GACCCGCAAT 3840

CGGGAAAGAA AGTTTGTTAT GTTCAGTTTC CGCAGAGATT TGATGGGATT GATAGACATG 3900

ATAGATACTC AAACCGTAAC GTTGTGTTCT TTGATGTATG TGTCCTTATC TCTTTTGCTT 3960

TGTTTCTGTT TATGTTTTAG TGCTTTTCCT CTTTTCTCAT TTGATATTGT TTTGGTGTGG 4020

AAACAGATTA ACATGAAAGG TCTTGATGGG ATACAAGGAC CGATATATGT CGGGACAGGT 4080

TGTGTGTTTA GAAAACAGGC TCTTTATGGT TTTGATGCAC CAAAGAAGAA GAAACCACCA 4140

GGCAAAACCT GTAACTGTTG GCCTAAATGG TGTTGTTTGT GTTGTGGGTT GAGAAAGAAG 4200

AGTAAAACGA AAGCCAAAGA TAAGAAAACT AACACTAAAG AGACTTCAAA GCAGATTCAT 4260

GCGCTAGAGA ATGTCGACGA AGGTGTTATC GTCCCAGGTA AAAAAAGAAG GAAAAAAAAA 4320

ACATTTCTTA TTTGGTTTCT GTCTTGTTGA AAGTCTAAGT AGATCCTTTT GATTGTTAGT 4380

GTCAAATGTT GAGAAGAGAT CTGAAGCAAC ACAATTGAAA TTGGAGAAGA AGTTTGGACA 4440

ATCTCCGGTT TTCGTTGCCT CTGCTGTTCT ACAGAACGGT GGAGTTCCCC GTAACGCAAG 4500

CCCCGCATGT TTGTTAAGAG AAGCCATTCA AGTTATTAGC TGCGGGTACG AAGATAAAAC 4560

CGAATGGGGA AAAGAGGTAG AAAACATTAC AAAGTTTTTC AACTTCTGAA AACTAGAAAA 4620

GTTCTTGTGA TCTCATTCTT GCTGATAATC ACACGCAGAT CGGGTGGATT TATGGATCGG 4680

TGACTGAAGA TATCCTGACG GGTTTCAAGA TGCATTGCCA TGGATGGAGA TCTGTGTACT 4740

GTATGCCTAA GCGTGCAGCT TTTAAAGGAT CTGCTCCTAT TAACTTGTCA GATCGTCTTC 4800

ATCAAGTTCT ACGTTGGGCT CTTGGCTCTG TAGAGATTTT CTTGAGCAGA CATTGTCCGA 4860

TATGGTATGG TTATGGTGGT GGTTTAAAAT GGTTGGAGAG ATTCTCTTAC ATCAACTCTG 4920

TCGTCTATCC TTGGACTTCA CTTCCATTGA TCGTCTATTG TTCTCTCCCC GCGGTTTGTT 4980

TACTCACAGG AAAATTCATC GTCCCTGAG 5009



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana
(B) STRAIN: Columbia

(vii) IMMEDIATE SOURCE:
(B) CLONE: RSW1 cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..3243

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

30 

ATG GAG GCC AGT GCC GGC TTG GTT GCT GGA TCC TAC CGG AGA AAC GAG 48
Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg Arg Asn Glu
1 5 10 15

CTC GTT CGG ATC CGA CAT GAA TCT GAT GGC GGG ACC AAA CCT TTG AAG 96
Leu Val Arg Ile Arg His Glu Ser Asp Gly Gly Thr Lys Pro Leu Lys
20 25 30

AAT ATG AAT GGC CAG ATA TGT CAG ATC TGT GGT GAT GAT GTT GGA CTC 144
Asn Met Asn Gly Gln Ile Cys Gln Ile Cys Gly Asp Asp Val Gly Leu
35 40 45

GCT GAA ACT GGA GAT GTC TTT GTC GCG TGT AAT GAA TGT GCC TTC CCT 192
Ala Glu Thr Gly Asp Val Phe Val Ala Cys Asn Glu Cys Ala Phe Pro
50 55 60

GTG TGT CGG CCT TGC TAT GAG TAC GAG AGG AAA GAT GGA ACT CAG TGT 240
Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Lys Asp Gly Thr Gln Cys
65 70 75 80

TGC CCT CAA TGC AAG ACT AGA TTC AGA CGA CAC AGG GGG AGT CCT CGT 288
Cys Pro Gln Cys Lys Thr Arg Phe Arg Arg His Arg Gly Ser Pro Arg
85 90 95

GTT GAA GGA GAT GAA GAT GAG GAT GAT GTT GAT GAT ATC GAG AAT GAG 336
Val Glu Gly Asp Glu Asp Glu Asp Asp Val Asp Asp Ile Glu Asn Glu
100 105 110

TTC AAT TAC GCC CAG GGA GCT AAC AAG GCG AGA CAC CAA CGC CAT GGC 384
Phe Asn Tyr Ala Gln Gly Ala Asn Lys Ala Arg His Gln Arg His Gly
115 120 125

GAA GAG TTT TCT TCT TCC TCT AGA CAT GAA TCT CAA CCA ATT CCT CTT 432
Glu Glu Phe Ser Ser Ser Ser Arg His Glu Ser Gln Pro Ile Pro Leu
130 135 140

CTC ACC CAT GGC CAT ACG GTT TCT GGA GAG ATT CGC ACG CCT GAT ACA 480
Leu Thr His Gly His Thr Val Ser Gly Glu Ile Arg Thr Pro Asp Thr
145 150 155 160

CAA TCT GTG CGA ACT ACA TCA GGT CCT TTG GGT CCT TCT GAC AGG AAT 528
Gln Ser Val Arg Thr Thr Ser Gly Pro Leu Gly Pro Ser Asp Arg Asn
165 170 175

GCT ATT TCA TCT CCA TAT ATT GAT CCA CGG CAA CCT GTC CCT GTA AGA 576
Ala Ile Ser Ser Pro Tyr Ile Asp Pro Arg Gln Pro Val Pro Val Arg
180 185 190

ATC GTG GAC CCG TCA AAA GAC TTG AAC TCT TAT GGG CTT GGT AAT GTT 624
Ile Val Asp Pro Ser Lys Asp Leu Asn Ser Tyr Gly Leu Gly Asn Val
195 200 205

GAC TGG AAA GAA AGA GTT GAA GGC TGG AAG CTG AAG CAG GAG AAA AAT 672
Asp Trp Lys Glu Arg Val Glu Gly Trp Lys Leu Lys Gln Glu Lys Asn
210 215 220

ATG TTA CAG ATG ACT GGT AAA TAC CAT GAA GGG AAA GGA GGA GAA ATT 720
Met Leu Gln Met Thr Gly Lys Tyr His Glu Gly Lys Gly Gly Glu Ile
225 230 235 240

GAA GGG ACT GGT TCC AAT GGC GAA GAA CTC CAA ATG GCT GAT GAT ACA 768
Glu Gly Thr Gly Ser Asn Gly Glu Glu Leu Gln Met Ala Asp Asp Thr
245 250 255

CGT CTT CCT ATG AGT CGT GTG GTG CCT ATC CCA TCT TCT CGC CTA ACC 816
Arg Leu Pro Met Ser Arg Val Val Pro Ile Pro Ser Ser Arg Leu Thr
260 265 270

CCT TAT CGG GTT GTG ATT ATT CTC CGG CTT ATC ATC TTG TGT TTC TTC 864
Pro Tyr Arg Val Val Ile Ile Leu Arg Leu Ile Ile Leu Cys Phe Phe
275 280 285

TTG CAA TAT CGT ACA ACT CAC CCT GTG AAA AAT GCA TAT CCT TTG TGG 912
Leu Gln Tyr Arg Thr Thr His Pro Val Lys Asn Ala Tyr Pro Leu Trp
290 295 300

TTG ACC TCG GTT ATC TGT GAG ATC TGG TTT GCA TTT TCT TGG CTT CTT 960
Leu Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Leu Leu
305 310 315 320

GAT CAG TTT CCC AAA TGG TAC CCC ATT AAC AGG GAG ACT TAT CTT GAC 1008
Asp Gln Phe Pro Lys Trp Tyr Pro Ile Asn Arg Glu Thr Tyr Leu Asp
325 330 335

CGT CTC GCT ATA AGA TAT GAT CGA GAC GGT GAA CCA TCA CAG CTC GTT 1056
Arg Leu Ala Ile Arg Tyr Asp Arg Asp Gly Glu Pro Ser Gln Leu Val
340 345 350

CCT GTT GAT GTG TTT GTT AGT ACA GTG GAC CCA TTG AAA GAG CCT CCC 1104
Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro
355 360 365

CTT GTT ACA GCA AAC ACA GTT CTC TCG ATT CTT TCT GTG GAC TAC CCG 1152
Leu Val Thr Ala Asn Thr Val Leu Ser Ile Leu Ser Val Asp Tyr Pro
370 375 380

GTA GAT AAA GTA GCC TGT TAT GTT TCA GAT GAT GGT TCA GCT ATG CTT 1200
Val Asp Lys Val Ala Cys Tyr Val Ser Asp Asp Gly Ser Ala Met Leu
385 390 395 400

ACC TTT GAA TCC CTT TCT GAA ACC GCT GAG TTT GCA AAG AAA TGG GTA 1248
Thr Phe Glu Ser Leu Ser Glu Thr Ala Glu Phe Ala Lys Lys Trp Val
405 410 415

CCA TTT TGC AAG AAA TTC AAC ATT GAA CCT AGG GCC CCT GAA TTC TAT 1296
Pro Phe Cys Lys Lys Phe Asn Ile Glu Pro Arg Ala Pro Glu Phe Tyr
420 425 430

TTT GCC CAG AAG ATA GAT TAC TTG AAG GAC AAG ATC CAA CCG TCT TTT 1344
Phe Ala Gln Lys Ile Asp Tyr Leu Lys Asp Lys Ile Gln Pro Ser Phe
435 440 445

GTT AAA GAG CGA CGA GCT ATG AAG AGA GAG TAT GAA GAG TTT AAA GTG 1392
Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val
450 455 460

AGG ATA AAT GCT CTT GTT GCC AAA GCA CAG AAA ATC CCT GAA GAA GGC 1440
Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Ile Pro Glu Glu Gly
465 470 475 480

TGG ACA ATG CAG GAT GGT ACT CCC TGG CCT GGT AAC AAC ACT AGA GAT 1488
Trp Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp
485 490 495

CAT CCT GGA ATG ATA CAG GTG TTC TTA GGC CAT AGT GGG GGT CTG GAT 1536
His Pro Gly Met Ile Gln Val Phe Leu Gly His Ser Gly Gly Leu Asp
500 505 510

ACC GAT GGA AAT GAG CTG CCT AGA CTC ATC TAT GTT TCT CGT GAA AAG 1584
Thr Asp Gly Asn Glu Leu Pro Arg Leu Ile Tyr Val Ser Arg Glu Lys
515 520 525

CGG CCT GGA TTT CAA CAC CAC AAA AAG GCT GGA GCT ATG AAT GCA TTG 1632
Arg Pro Gly Phe Gln His His Lys Lys Ala Gly Ala Met Asn Ala Leu
530 535 540

ATC CGT GTA TCT GCT GTT CTT ACC AAT GGA GCA TAT CTT TTG AAC GTG 1680
Ile Arg Val Ser Ala Val Leu Thr Asn Gly Ala Tyr Leu Leu Asn Val
545 550 555 560

GAT TGT GAT CAT TAC TTT AAT AAC AGT AAG GCT ATT AAA GAA GCT ATG 1728
Asp Cys Asp His Tyr Phe Asn Asn Ser Lys Ala Ile Lys Glu Ala Met
565 570 575

TGT TTC ATG ATG GAC CCG GCT ATT GGA AAG AAG TGC TGC TAT GTC CAG 1776
Cys Phe Met Met Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gln
580 585 590

TTC CCT CAA CGT TTT GAC GGT ATT GAT TTG CAC GAT CGA TAT GCC AAC 1824
Phe Pro Gln Arg Phe Asp Gly Ile Asp Leu His Asp Arg Tyr Ala Asn
595 600 605

AGG AAT ATA GTC TTT TTC GAT ATT AAC ATG AAG GGG TTG GAT GGT ATC 1872
Arg Asn Ile Val Phe Phe Asp Ile Asn Met Lys Gly Leu Asp Gly Ile
610 615 620

CAG GGT CCA GTA TAT GTG GGT ACT GGT TGT TGT TTT AAT AGG CAG GCT 1920
Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Cys Phe Asn Arg Gln Ala
625 630 635 640

CTA TAT GGG TAT GAT CCT GTT TTG ACG GAA GAA GAT TTA GAA CCA AAT 1968
Leu Tyr Gly Tyr Asp Pro Val Leu Thr Glu Glu Asp Leu Glu Pro Asn
645 650 655

ATT ATT GTC AAG AGC TGT TGC GGG TCA AGG AAG AAA GGT AAA AGT AGC 2016
Ile Ile Val Lys Ser Cys Cys Gly Ser Arg Lys Lys Gly Lys Ser Ser
660 665 670

AAG AAG TAT AAC TAC GAA AAG AGG AGA GGC ATC AAC AGA AGT GAC TCC 2064
Lys Lys Tyr Asn Tyr Glu Lys Arg Arg Gly Ile Asn Arg Ser Asp Ser
675 680 685

AAT GCT CCA CTT TTC AAT ATG GAG GAC ATC GAT GAG GGT TTT GAA GGT 2112
Asn Ala Pro Leu Phe Asn Met Glu Asp Ile Asp Glu Gly Phe Glu Gly
690 695 700

TAT GAT GAT GAG AGG TCT ATT CTA ATG TCC CAG AGG AGT GTA GAG AAG 2160
Tyr Asp Asp Glu Arg Ser Ile Leu Met Ser Gln Arg Ser Val Glu Lys
705 710 715 720

CGT TTT GGT CAG TCG CCG GTA TTT ATT GCG GCA ACC TTC ATG GAA CAA 2208
Arg Phe Gly Gln Ser Pro Val Phe Ile Ala Ala Thr Phe Met Glu Gln
725 730 735

GGC GGC ATT CCA CCA ACA ACC AAT CCC GCT ACT CTT CTG AAG GAG GCT 2256
Gly Gly Ile Pro Pro Thr Thr Asn Pro Ala Thr Leu Leu Lys Glu Ala
740 745 750

ATT CAT GTT ATA AGC TGT GGT TAC GAA GAC AAG ACT GAA TGG GGC AAA 2304
Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp Gly Lys
755 760 765

GAG ATT GGT TGG ATC TAT GGT TCC GTG ACG GAA GAT ATT CTT ACT GGG 2352
Glu Ile Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly
770 775 780

TTC AAG ATG CAT GCC CGG GGT TGG ATA TCG ATC TAC TGC AAT CCT CCA 2400
Phe Lys Met His Ala Arg Gly Trp Ile Ser Ile Tyr Cys Asn Pro Pro
785 790 795 800

CGC CCT GCG TTC AAG GGA TCT GCA CCA ATC AAT CTT TCT GAT CGT TTG 2448
Arg Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu
805 810 815

AAC CAA GTT CTT CGA TGG GCT TTG GGA TCT ATC GAG ATT CTT CTT AGC 2496
Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Ile Glu Ile Leu Leu Ser
820 825 830

AGA CAT TGT CCT ATC TGG TAT GGT TAC CAT GGA AGG TTG AGA CTT TTG 2544
Arg His Cys Pro Ile Trp Tyr Gly Tyr His Gly Arg Leu Arg Leu Leu
835 840 845

GAG AGG ATC GCT TAT ATC AAC ACC ATC GTC TAT CCT ATT ACA TCC ATC 2592
Glu Arg Ile Ala Tyr Ile Asn Thr Ile Val Tyr Pro Ile Thr Ser Ile
850 855 860

CCT CTT ATT GCG TAT TGT ATT CTT CCC GCT TTT TGT CTC ATC ACC GAC 2640
Pro Leu Ile Ala Tyr Cys Ile Leu Pro Ala Phe Cys Leu Ile Thr Asp
865 870 875 880

AGA TTC ATC ATA CCC GAG ATA AGC AAC TAC GCG AGT ATT TGG TTC ATT 2688
Arg Phe Ile Ile Pro Glu Ile Ser Asn Tyr Ala Ser Ile Trp Phe Ile
885 890 895

CTA CTC TTC ATC TCA ATT GCT GTG ACT GGA ATC CTG GAG CTG AGA TGG 2736
Leu Leu Phe Ile Ser Ile Ala Val Thr Gly Ile Leu Glu Leu Arg Trp
900 905 910

AGC GGT GTG AGC ATT GAG GAT TGG TGG AGG AAC GAG CAG TTC TGG GTC 2784
Ser Gly Val Ser Ile Glu Asp Trp Trp Arg Asn Glu Gln Phe Trp Val
915 920 925

ATT GGT GGC ACA TCC GCC CAT CTT TTT GCT GTC TTC CAA GGT CTA CTT 2832
Ile Gly Gly Thr Ser Ala His Leu Phe Ala Val Phe Gln Gly Leu Leu
930 935 940

AAG GTT CTT GCT GGT ATC GAC ACC AAC TTC ACC GTT ACA TCT AAA GCC 2880
Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr Ser Lys Ala
945 950 955 960

ACA GAC GAA GAT GGG GAT TTT GCA GAA CTC TAC ATC TTC AAA TGG ACA 2928
Thr Asp Glu Asp Gly Asp Phe Ala Glu Leu Tyr Ile Phe Lys Trp Thr
965 970 975

GCT CTT CTC ATT CCA CCA ACC ACC GTC CTA CTT GTG AAC CTC ATA GGC 2976
Ala Leu Leu Ile Pro Pro Thr Thr Val Leu Leu Val Asn Leu Ile Gly
980 985 990

ATT GTG GCT GGT GTC TCT TAT GCT GTA AAC AGT GGC TAC CAG TCG TGG 3024
Ile Val Ala Gly Val Ser Tyr Ala Val Asn Ser Gly Tyr Gln Ser Trp
995 1000 1005

GGT CCG CTT TTC GGG AAG CTC TTC TTC GCC TTA TGG GTT ATT GCC CAT 3072
Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala Leu Trp Val Ile Ala His
1010 1015 1020

CTC TAC CCT TTC TTG AAA GGT CTG TTG GGA AGA CAA AAC CGA ACA CCA 3120
Leu Tyr Pro Phe Leu Lys Gly Leu Leu Gly Arg Gln Asn Arg Thr Pro
1025 1030 1035 1040

ACC ATC GTC ATT GTC TGG TCT GTT CTT CTC GCC TCC ATC TTC TCG TTG 3168
Thr Ile Val Ile Val Trp Ser Val Leu Leu Ala Ser Ile Phe Ser Leu
1045 1050 1055

CTT TGG GTC AGG ATC AAT CCC TTT GTG GAC GCC AAT CCC AAT GCC AAC 3216
Leu Trp Val Arg Ile Asn Pro Phe Val Asp Ala Asn Pro Asn Ala Asn
1060 1065 1070

AAC TTC AAT GGC AAA GGA GGT GTC TTT TAGACCCTAT TTATATACTT 3263
Asn Phe Asn Gly Lys Gly Gly Val Phe
1075 1080

GTGTGTGCAT ATATCAAAAA CGCGCAATGG GAATTCCAAA TCATCTAAAC CCATCAAACC 3323

CCAGTGAACC GGGCAGTTAA GGTGATTCCA TGTCCAAGAT TAGCTTTCTC CGAGTAGCCA 3383

GAGAAGGTGA AATTGTTCGT AACACTATTG TAATGATTTT CCAGTGGGGA AGAAGATGTG 3443

GACCCAAATG ATACATAGTC TACAAAAAGA ATTTGTTATT CTTTCTTATA TTTATTTTAT 3503

TTAAAGCTTG TTAGACTCAC ACTTATGTAA TGTTGGAACT TGTTGTCCTA AAAAGGGATT 3563

GGAGTTTTCT TTTTATCTAA GAATCTGAAG TTTATATGCT 3603



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1081 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg Arg Asn Glu
1 5 10 15

Leu Val Arg Ile Arg His Glu Ser Asp Gly Gly Thr Lys Pro Leu Lys
20 25 30

Asn Met Asn Gly Gln Ile Cys Gln Ile Cys Gly Asp Asp Val Gly Leu
35 40 45

Ala Glu Thr Gly Asp Val Phe Val Ala Cys Asn Glu Cys Ala Phe Pro
50 55 60

Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Lys Asp Gly Thr Gln Cys
65 70 75 80

Cys Pro Gln Cys Lys Thr Arg Phe Arg Arg His Arg Gly Ser Pro Arg
85 90 95

Val Glu Gly Asp Glu Asp Glu Asp Asp Val Asp Asp Ile Glu Asn Glu
100 105 110

Phe Asn Tyr Ala Gln Gly Ala Asn Lys Ala Arg His Gln Arg His Gly
115 120 125

Glu Glu Phe Ser Ser Ser Ser Arg His Glu Ser Gln Pro Ile Pro Leu
130 135 140

Leu Thr His Gly His Thr Val Ser Gly Glu Ile Arg Thr Pro Asp Thr
145 150 155 160



Gln Ser Val Arg Thr Thr Ser Gly Pro Leu Gly Pro Ser Asp Arg Asn
165 170 175

Ala Ile Ser Ser Pro Tyr Ile Asp Pro Arg Gln Pro Val Pro Val Arg
180 185 190

Ile Val Asp Pro Ser Lys Asp Leu Asn Ser Tyr Gly Leu Gly Asn Val
195 200 205

Asp Trp Lys Glu Arg Val Glu Gly Trp Lys Leu Lys Gln Glu Lys Asn
210 215 220

Met Leu Gln Met Thr Gly Lys Tyr His Glu Gly Lys Gly Gly Glu Ile
225 230 235 240

Glu Gly Thr Gly Ser Asn Gly Glu Glu Leu Gln Met Ala Asp Asp Thr
245 250 255

Arg Leu Pro Met Ser Arg Val Val Pro Ile Pro Ser Ser Arg Leu Thr
260 265 270

Pro Tyr Arg Val Val Ile Ile Leu Arg Leu Ile Ile Leu Cys Phe Phe
275 280 285

Leu Gln Tyr Arg Thr Thr His Pro Val Lys Asn Ala Tyr Pro Leu Trp
290 295 300

Leu Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Leu Leu
305 310 315 320

Asp Gln Phe Pro Lys Trp Tyr Pro Ile Asn Arg Glu Thr Tyr Leu Asp
325 330 335

Arg Leu Ala Ile Arg Tyr Asp Arg Asp Gly Glu Pro Ser Gln Leu Val
340 345 350

Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro
355 360 365

Leu Val Thr Ala Asn Thr Val Leu Ser Ile Leu Ser Val Asp Tyr Pro
370 375 380

Val Asp Lys Val Ala Cys Tyr Val Ser Asp Asp Gly Ser Ala Met Leu
385 390 395 400

Thr Phe Glu Ser Leu Ser Glu Thr Ala Glu Phe Ala Lys Lys Trp Val
405 410 415

Pro Phe Cys Lys Lys Phe Asn Ile Glu Pro Arg Ala Pro Glu Phe Tyr
420 425 430

Phe Ala Gln Lys Ile Asp Tyr Leu Lys Asp Lys Ile Gln Pro Ser Phe
435 440 445

Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val
450 455 460

Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Ile Pro Glu Glu Gly
465 470 475 480

Trp Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp
485 490 495

His Pro Gly Met Ile Gln Val Phe Leu Gly His Ser Gly Gly Leu Asp
500 505 510

Thr Asp Gly Asn Glu Leu Pro Arg Leu Ile Tyr Val Ser Arg Glu Lys
515 520 525

Arg Pro Gly Phe Gln His His Lys Lys Ala Gly Ala Met Asn Ala Leu
530 535 540

Ile Arg Val Ser Ala Val Leu Thr Asn Gly Ala Tyr Leu Leu Asn Val
545 550 555 560

Asp Cys Asp His Tyr Phe Asn Asn Ser Lys Ala Ile Lys Glu Ala Met
565 570 575

Cys Phe Met Met Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gln
580 585 590

Phe Pro Gln Arg Phe Asp Gly Ile Asp Leu His Asp Arg Tyr Ala Asn
595 600 605

Arg Asn Ile Val Phe Phe Asp Ile Asn Met Lys Gly Leu Asp Gly Ile
610 615 620

Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Cys Phe Asn Arg Gln Ala
625 630 635 640

Leu Tyr Gly Tyr Asp Pro Val Leu Thr Glu Glu Asp Leu Glu Pro Asn
645 650 655

Ile Ile Val Lys Ser Cys Cys Gly Ser Arg Lys Lys Gly Lys Ser Ser
660 665 670

Lys Lys Tyr Asn Tyr Glu Lys Arg Arg Gly Ile Asn Arg Ser Asp Ser
675 680 685

Asn Ala Pro Leu Phe Asn Met Glu Asp Ile Asp Glu Gly Phe Glu Gly
690 695 700

Tyr Asp Asp Glu Arg Ser Ile Leu Met Ser Gln Arg Ser Val Glu Lys
705 710 715 720

Arg Phe Gly Gln Ser Pro Val Phe Ile Ala Ala Thr Phe Met Glu Gln
725 730 735

Gly Gly Ile Pro Pro Thr Thr Asn Pro Ala Thr Leu Leu Lys Glu Ala
740 745 750

Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp Gly Lys
755 760 765

Glu Ile Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly
770 775 780

Phe Lys Met His Ala Arg Gly Trp Ile Ser Ile Tyr Cys Asn Pro Pro
785 790 795 800

Arg Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu
805 810 815

Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Ile Glu Ile Leu Leu Ser
820 825 830

Arg His Cys Pro Ile Trp Tyr Gly Tyr His Gly Arg Leu Arg Leu Leu
835 840 845

Glu Arg Ile Ala Tyr Ile Asn Thr Ile Val Tyr Pro Ile Thr Ser Ile
850 855 860

Pro Leu Ile Ala Tyr Cys Ile Leu Pro Ala Phe Cys Leu Ile Thr Asp
865 870 875 880

Arg Phe Ile Ile Pro Glu Ile Ser Asn Tyr Ala Ser Ile Trp Phe Ile
885 890 895

Leu Leu Phe Ile Ser Ile Ala Val Thr Gly Ile Leu Glu Leu Arg Trp
900 905 910

Ser Gly Val Ser Ile Glu Asp Trp Trp Arg Asn Glu Gln Phe Trp Val
915 920 925

Ile Gly Gly Thr Ser Ala His Leu Phe Ala Val Phe Gln Gly Leu Leu
930 935 940

Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr Ser Lys Ala
945 950 955 960

Thr Asp Glu Asp Gly Asp Phe Ala Glu Leu Tyr Ile Phe Lys Trp Thr
965 970 975

Ala Leu Leu Ile Pro Pro Thr Thr Val Leu Leu Val Asn Leu Ile Gly
980 985 990

Ile Val Ala Gly Val Ser Tyr Ala Val Asn Ser Gly Tyr Gln Ser Trp
995 1000 1005

Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala Leu Trp Val Ile Ala His
1010 1015 1020

Leu Tyr Pro Phe Leu Lys Gly Leu Leu Gly Arg Gln Asn Arg Thr Pro
1025 1030 1035 1040

Thr Ile Val Ile Val Trp Ser Val Leu Leu Ala Ser Ile Phe Ser Leu
1045 1050 1055

Leu Trp Val Arg Ile Asn Pro Phe Val Asp Ala Asn Pro Asn Ala Asn
1060 1065 1070

Asn Phe Asn Gly Lys Gly Gly Val Phe
1075 1080



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3828 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana
(B) STRAIN: Columbia

(vii) IMMEDIATE SOURCE:
(B) CLONE: Ath-A

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 239..3490

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

5 

GTCGACACTA AGTGGATCCA AAGAATTCGC GGCCGCGTCG ATACGGCTGC GAGAAGACGA 60

CAGAAGGGGA TTGTCGATTC GGTTTATTTC GTCTCCTTCG TCTTCCACTC TTACTAGTGC 120

ATGCTCTGAA TCTGTATGTA ATGGGAGTTC AACAGTCTGG ATCCATTATC CTAGCCGGGT 180

CGGGTCAAGG TCTTTGAATA AGAGAGACAA TTCGTTTTGA TTCGGTGTAG AAGACATC 238

ATG AAT ACT GGT GGT CGG CTC ATT GCT GGC TCT CAC AAC AGA AAC GAA 286
Met Asn Thr Gly Gly Arg Leu Ile Ala Gly Ser His Asn Arg Asn Glu
1 5 10 15

TTC GTT CTC ATT AAC GCC GAT GAG AGT GCC AGA ATA CGA TCA GTA CAA 334
Phe Val Leu Ile Asn Ala Asp Glu Ser Ala Arg Ile Arg Ser Val Gln
20 25 30

GAA CTG AGT GGG CAA ACA TGT CAA ATC TGT GGA GAT GAA ATC GAA TTA 382
Glu Leu Ser Gly Gln Thr Cys Gln Ile Cys Gly Asp Glu Ile Glu Leu
35 40 45

ACG GTT AGC AGT GAG CTC TTT GTT GCT TGC AAC GAA TGC GCA TTC CCG 430
Thr Val Ser Ser Glu Leu Phe Val Ala Cys Asn Glu Cys Ala Phe Pro
50 55 60

GTT TGT AGA CCA TGC TAT GAG TAT GAA CGT AGA GAA GGA AAT CAA GCT 478
Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Arg Glu Gly Asn Gln Ala
65 70 75 80

TGT CCT CAG TGC AAA ACT CGA TAC AAA AGG ATT AAA GGT AGT CCA CGG 526
Cys Pro Gln Cys Lys Thr Arg Tyr Lys Arg Ile Lys Gly Ser Pro Arg
85 90 95

GTT GAT GGA GAT GAT GAA GAA GAA GAA GAC ATT GAT GAT CTT GAG TAT 574
Val Asp Gly Asp Asp Glu Glu Glu Glu Asp Ile Asp Asp Leu Glu Tyr
100 105 110

GAG TTT GAT CAT GGG ATG GAC CCT GAA CAT GCC GCT GAA GCC GCA CTC 622
Glu Phe Asp His Gly Met Asp Pro Glu His Ala Ala Glu Ala Ala Leu
115 120 125

TCT TCA CGC CTT AAC ACC GGT CGT GGT GGA TTG GAT TCA GCT CCA CCT 670
Ser Ser Arg Leu Asn Thr Gly Arg Gly Gly Leu Asp Ser Ala Pro Pro
130 135 140

GGC TCT CAG ATT CCT CTT TTG ACT TAT TGT GAT GAA GAT GCT GAT ATG 718
Gly Ser Gln Ile Pro Leu Leu Thr Tyr Cys Asp Glu Asp Ala Asp Met
145 150 155 160

TAT TCT GAT CGT CAT GCT CTT ATC GTG CCT CCT TCA ACG GGA TAT GGG 766
Tyr Ser Asp Arg His Ala Leu Ile Val Pro Pro Ser Thr Gly Tyr Gly
165 170 175

AAT CGC GTC TAT CCT GCA CCG TTT ACA GAT TCT TCT GCA CCT CCA CAG 814
Asn Arg Val Tyr Pro Ala Pro Phe Thr Asp Ser Ser Ala Pro Pro Gln
180 185 190

GCG AGA TCA ATG GTT CCT CAG AAA GAT ATT GCG GAA TAT GGT TAT GGA 862
Ala Arg Ser Met Val Pro Gln Lys Asp Ile Ala Glu Tyr Gly Tyr Gly
195 200 205



AGT GTT GCT TGG AAG GAC CGT ATG GAA GTT TGG AAG AGA CGA CAA GGC 910
Ser Val Ala Trp Lys Asp Arg Met Glu Val Trp Lys Arg Arg Gln Gly
210 215 220

GAA AAG CTT CAA GTC ATT AAG CAT GAA GGA GGA AAC AAT GGT CGA GGT 958
Glu Lys Leu Gln Val Ile Lys His Glu Gly Gly Asn Asn Gly Arg Gly
225 230 235 240

TCC AAT GAT GAC GAC GAA CTA GAT GAT CCT GAC ATG CCT ATG ATG GAT 1006
Ser Asn Asp Asp Asp Glu Leu Asp Asp Pro Asp Met Pro Met Met Asp
245 250 255

GAA GGA AGA CAA CCT CTC TCA AGA AAG CTA CCT ATT CGT TCA AGC AGA 1054
Glu Gly Arg Gln Pro Leu Ser Arg Lys Leu Pro Ile Arg Ser Ser Arg
260 265 270

ATA AAT CCT TAC AGG ATG TTA ATT CTG TGT CGC CTC GCG ATT CTT GGT 1102
Ile Asn Pro Tyr Arg Met Leu Ile Leu Cys Arg Leu Ala Ile Leu Gly
275 280 285

CTT TTC TTT CAT TAT AGA ATT CTC CAT CCA GTC AAT GAT GCA TAT GGA 1150
Leu Phe Phe His Tyr Arg Ile Leu His Pro Val Asn Asp Ala Tyr Gly
290 295 300

TTA TGG TTA ACG TCA GTT ATA TGC GAA ATA TGG TTT GCA GTG TCT TGG 1198
Leu Trp Leu Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Val Ser Trp
305 310 315 320

ATT CTT GAT CAA TTC CCC AAA TGG TAT CCT ATA GAA CGT GAA ACA TAC 1246
Ile Leu Asp Gln Phe Pro Lys Trp Tyr Pro Ile Glu Arg Glu Thr Tyr
325 330 335

CTC GAT AGA CTC TCT CTC AGG TAC GAG AAG GAA GGA AAA CCG TCA GGA 1294
Leu Asp Arg Leu Ser Leu Arg Tyr Glu Lys Glu Gly Lys Pro Ser Gly
340 345 350

TTA GCA CCT GTT GAT GTT TTT GTT AGT ACA GTG GAT CCG TTG AAA GAG 1342
Leu Ala Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu
355 360 365

CCC CCC TTG ATT ACA GCA AAC ACA GTT CTT TCC ATT CTA GCA GTT GAT 1390
Pro Pro Leu Ile Thr Ala Asn Thr Val Leu Ser Ile Leu Ala Val Asp
370 375 380

TAT CCT GTG GAT AAG GTT GCG TGT TAT GTA TCA AAC AAT GGT GCA GCT 1438
Tyr Pro Val Asp Lys Val Ala Cys Tyr Val Ser Asn Asn Gly Ala Ala
385 390 395 400

ATG CTT ACA TTT GAA GCT CTC TCT GAT ACA GCT GAT TTT GCT ACA AAA 1486
Met Leu Thr Phe Glu Ala Leu Ser Asp Thr Ala Asp Phe Ala Thr Lys
405 410 415

TGG GTT CCT TTT TGT AAG AAG TTT AAT ATC GAG CCA CGA GCT CCT GAG 1534
Trp Val Pro Phe Cys Lys Lys Phe Asn Ile Glu Pro Arg Ala Pro Glu
420 425 430

TGG TAT TTT TCT CAG AAG ATG GAT TAC CTG AAG AAC AAA GTT CAT CCT 1582
Trp Tyr Phe Ser Gln Lys Met Asp Tyr Leu Lys Asn Lys Val His Pro
435 440 445

GCT TTT GTC AGG GAA CGT CGT GCT ATG AAG AGA GAT TAT GAA GAG TTT 1630
Ala Phe Val Arg Glu Arg Arg Ala Met Lys Arg Asp Tyr Glu Glu Phe
450 455 460

AAA GTG AAG ATA AAT GCA CTG GTT GCT ACT GCA CAG AAA GTG CCT GAG 1678
Lys Val Lys Ile Asn Ala Leu Val Ala Thr Ala Gln Lys Val Pro Glu
465 470 475 480

GAA CGT TGG ACT ATG CAA GAT GGA ACT CCT TGG CCT GGA AAC AAC GTC 1726
Glu Arg Trp Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn Val
485 490 495

CGT GAC CAT CCT GGA ATG ATT CAG GTG TTC TTG GGT CAT AGT GGA GTT 1774
Arg Asp His Pro Gly Met Ile Gln Val Phe Leu Gly His Ser Gly Val
500 505 510

CGT GAT ACG GAT GGT AAT GAG TTA CCA CGT CTA GTG TAT GTT TCT CGT 1822
Arg Asp Thr Asp Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg
515 520 525

GAG AAG CGG CCT GGA TTT GAT CAC CAC AAG AAA GCT GGA GCT ATG AAT 1870
Glu Lys Arg Pro Gly Phe Asp His His Lys Lys Ala Gly Ala Met Asn
530 535 540

TCC TTG ATC CGA GTC TCT GCT GTT CTA TCA AAC GCT CCT TAC CTT CTT 1918
Ser Leu Ile Arg Val Ser Ala Val Leu Ser Asn Ala Pro Tyr Leu Leu
545 550 555 560

AAT GTC GAT TGT GAT CAC TAC ATC AAC AAC AGC AAA GCA ATT AGA GAA 1966
Asn Val Asp Cys Asp His Tyr Ile Asn Asn Ser Lys Ala Ile Arg Glu
565 570 575

TCT ATG TGT TTC ATG ATG GAC CCG CAA TCG GGA AAG AAA GTT TGT TAT 2014
Ser Met Cys Phe Met Met Asp Pro Gln Ser Gly Lys Lys Val Cys Tyr
580 585 590

GTT CAG TTT CCG CAG AGA TTT GAT GGG ATT GAT AGA CAT GAT AGA TAC 2062
Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg His Asp Arg Tyr
595 600 605

TCA AAC CGT AAC GTT GTG TTC TTT GAT ATT AAC ATG AAA GGT CTT GAT 2110
Ser Asn Arg Asn Val Val Phe Phe Asp Ile Asn Met Lys Gly Leu Asp
610 615 620

GGG ATA CAA GGA CCG ATA TAT GTC GGG ACA GGT TGT GTG TTT AGA AAA 2158
Gly Ile Gln Gly Pro Ile Tyr Val Gly Thr Gly Cys Val Phe Arg Lys
625 630 635 640

CAG GCT CTT TAT GGT TTT GAT GCA CCA AAG AAG AAG AAA CCA CCA GGC 2206
Gln Ala Leu Tyr Gly Phe Asp Ala Pro Lys Lys Lys Lys Pro Pro Gly
645 650 655

AAA ACC TGT AAC TGT TGG CCT AAA TGG TGT TGT TTG TGT TGT GGG TTG 2254
Lys Thr Cys Asn Cys Trp Pro Lys Trp Cys Cys Leu Cys Cys Gly Leu
660 665 670

AGA AAG AAG AGT AAA ACG AAA GCC ACA GAT AAG AAA ACT AAC ACT AAA 2302
Arg Lys Lys Ser Lys Thr Lys Ala Thr Asp Lys Lys Thr Asn Thr Lys
675 680 685

GAG ACT TCA AAG CAG ATT CAT GCG CTA GAG AAT GTC GAC GAA GGT GTT 2350
Glu Thr Ser Lys Gln Ile His Ala Leu Glu Asn Val Asp Glu Gly Val
690 695 700

ATC GTC CCA GTG TCA AAT GTT GAG AAG AGA TCT GAA GCA ACA CAA TTG 2398
Ile Val Pro Val Ser Asn Val Glu Lys Arg Ser Glu Ala Thr Gln Leu
705 710 715 720

AAA TTG GAG AAG AAG TTT GGA CAA TCT CCG GTT TTC GTT GCC TCT GCT 2446
Lys Leu Glu Lys Lys Phe Gly Gln Ser Pro Val Phe Val Ala Ser Ala
725 730 735

GTT CTA CAG AAC GGT GGA GTT CCC CGT AAC GCA AGC CCC GCA TGT TTG 2494
Val Leu Gln Asn Gly Gly Val Pro Arg Asn Ala Ser Pro Ala Cys Leu
740 745 750

TTA AGA GAA GCC ATT CAA GTT ATT AGC TGC GGG TAC CAA GAT AAA ACC 2542
Leu Arg Glu Ala Ile Gln Val Ile Ser Cys Gly Tyr Gln Asp Lys Thr
755 760 765

GAA TGG GGA AAA GAG ATC GGG TGG ATT TAT GGA TCG GTG ACT GAA GAT 2590
Glu Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp
770 775 780

ATC CTG ACG GGT TTC AAG ATG CAT TGC CAT GGA TGG AGA TCT GTG TAC 2638
Ile Leu Thr Gly Phe Lys Met His Cys His Gly Trp Arg Ser Val Tyr
785 790 795 800

TGT ATG CCT AAG CGT GCA GCT TTT AAA GGA TCT GCT CCT ATT AAC TTG 2686
Cys Met Pro Lys Arg Ala Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu
805 810 815

TCA GAT CGT CTT CAT CAA GTT CTA CGT TGG GCT CTT GGC TCT GTA GAG 2734
Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu
820 825 830

ATT TTC TTG AGC AGA CAT TGT CCG ATA TGG TAT GGT TAT GGT GGT GGT 2782
Ile Phe Leu Ser Arg His Cys Pro Ile Trp Tyr Gly Tyr Gly Gly Gly
835 840 845

TTA AAA TGG TTG GAG AGA TTC TCT TAC ATC AAC TCT GTC GTC TAT CCT 2830
Leu Lys Trp Leu Glu Arg Phe Ser Tyr Ile Asn Ser Val Val Tyr Pro
850 855 860

TGG ACT TCA CTT CCA TTG ATC GTC TAT TGT TCT CTC CCC GCG GTT TGT 2878
Trp Thr Ser Leu Pro Leu Ile Val Tyr Cys Ser Leu Pro Ala Val Cys
865 870 875 880

TTA CTC ACA GGA AAA TTC ATC GTC CCT GAG ATA AGC AAC TAC GCA GGT 2926
Leu Leu Thr Gly Lys Phe Ile Val Pro Glu Ile Ser Asn Tyr Ala Gly
885 890 895

ATA CTC TTC ATG CTC ATG TTC ATA TCC ATA GCA GTA ACT GGA ATC CTC 2974
Ile Leu Phe Met Leu Met Phe Ile Ser Ile Ala Val Thr Gly Ile Leu
900 905 910

GAA ATG CAA TGG GGA GGT GTC GGA ATC GAT GAT TGG TGG AGA AAC GAG 3022
Glu Met Gln Trp Gly Gly Val Gly Ile Asp Asp Trp Trp Arg Asn Glu
915 920 925

5 CAG TTT TGG GTA ATC GGA GGG GCC TCC TCG CAT CTA TTT GCT CTG TTT 3 070 

Gin Phe Trp Val lie Gly Gly Ala Ser Ser His Leu Phe Ala Leu Phe 
930 935 940 



CAA GGT TTG CTC AAA GTT CTA GCC GGA GTT AAC ACG AAT TTC ACA GTC 3118 

10 Gin Gly Leu Leu Lys Val Leu Ala Gly Val Asn Thr Asn Phe Thr Val 
945 950 955 960 

ACT TCA AAA GCA GCA GAC GAT GGA GCT TTC TCT GAG CTT TAC ATC TTC 3166 

Thr Ser Lys Ala Ala Asp Asp Gly Ala Phe Ser Glu Leu Tyr lie Phe 

15 965 970 975 



AAG TGG ACA ACT TTG TTG ATT CCT CCG ACA ACA CTT CTG ATC ATT AAC 3 214 

Lys Trp Thr Thr Leu Leu lie Pro Pro Thr Thr Leu Leu lie lie Asn 

980 985 990 

20 

ATC ATT GGA GTT ATT GTC GGC GTT TCT GAT GCC ATT AGC AAT GGC TAT 3 26 2 

lie lie Gly Val lie Val Gly Val Ser Asp Ala He Ser Asn Gly Tyr 
995 1000 1005 



25 GAC TCA TGG GGA CCT CTC TTT GGG AGA CTT TTC TTC GCT CTT TGG GTC 3 310 

Asp Ser Trp Gly Pro Leu Phe Gly Arg Leu Phe Phe Ala Leu Trp Val 
1010 1015 1020 

ATT GTT CAT TTA TAC CCA TTC CTC AAG GGA ATG CTT GGG AAG CAA GAC 3 3 58 

30 lie Val His Leu Tyr Pro Phe Leu Lys Gly Met Leu Gly Lys Gin Asp 

1025 1030 1035 1040 



AAA ATG CCT ACG ATT ATT GTG GTC TGG TCT ATT CTT CTA GCT TCG ATC 34 06 

Lys Met Pro Thr He He Val Val Trp Ser He Leu Leu Ala Ser He 

35 1045 1050 1055 

TTG ACA CTC TTG TGG GTC AGA ATT AAC CCG TTT GTG GCT AAA GGG GGA 3 4 54 

Leu Thr Leu Leu Trp Val Arg He Asn Pro Phe Val Ala Lys Gly Gly 

1060 1065 1070 

40 
CCA GTG TTG GAG ATC TGT GGT CTG AAT TGT GGA AAC TAAGATCCTC 3500

Pro Val Leu Glu Ile Cys Gly Leu Asn Cys Gly Asn
1075 1080

AGTGAAAGAA GAGCAAAGGA GTTTGTGTTG GAGCTTTGGA AGCAAATGTG TTGATGATGA 3560

TGCAAGTGTG TTTGTAGACA AAGATGTGCA GTTTTTACTT TTTACGACTT GTTAAACCTT 3620

TTTTGTTACC CCTAAATTAA TTCTTTTGTT ATCATGGTTA TACTAATAGA ATTGTTTGTT 3680

TTTCTTTTTT ACATGTACTT TTAGTTATTC CGTAGTTATT GTATAATACT GATAACGATC 3740

ATATATACAC ACTTTGTTAA CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAGCGGCCG 3800

CTCGAATTGT CGACGCGGCC GCGAATTC 3828



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1084 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Asn Thr Gly Gly Arg Leu Ile Ala Gly Ser His Asn Arg Asn Glu
1 5 10 15

Phe Val Leu He Asn Ala Asp Glu Ser Ala Arg lie Arg Ser Val Gin 
35 20 25 30 

Glu Leu Ser Gly Gin Thr Cys Gin He Cys Gly Asp Glu He Glu Leu 
35 40 45 
Thr Val Ser Ser Glu Leu Phe Val Ala Cys Asn Glu Cys Ala Phe Pro
50 55 60 

Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Arg Glu Gly Asn Gin Ala 
5 65 70 75 80 

Cys Pro Gin Cys Lys Thr Arg Tyr Lys Arg lie Lys Gly Ser Pro Arg 

85 90 95 

10 Val Asp Gly Asp Asp Glu Glu Glu Glu Asp lie Asp Asp Leu Glu Tyr 

100 105 110 

Glu Phe Asp His Gly Met Asp Pro Glu His Ala Ala Glu Ala Ala Leu 
115 120 125 

15 

Ser Ser Arg Leu Asn Thr Gly Arg Gly Gly Leu Asp Ser Ala Pro Pro 
130 135 140 

Gly Ser Gin lie Pro Leu Leu Thr Tyr Cys Asp Glu Asp Ala Asp Met 
20 145 150 155 160 

Tyr Ser Asp Arg His Ala Leu He Val Pro Pro Ser Thr Gly Tyr Gly 

165 170 175 

25 Asn Arg Val Tyr Pro Ala Pro Phe Thr Asp Ser Ser Ala Pro Pro Gin 

180 185 190 

Ala Arg Ser Met Val Pro Gin Lys Asp He Ala Glu Tyr Gly Tyr Gly 
195 200 205 

30 

Ser Val Ala Trp Lys Asp Arg Met Glu Val Trp Lys Arg Arg Gin Gly 
210 215 220 



Glu Lys Leu Gin Val He Lys His Glu Gly Gly Asn Asn Gly Arg Gly 

35 225 230 235 240 

Ser Asn Asp Asp Asp Glu Leu Asp Asp Pro Asp Met Pro Met Met Asp
245 250 255

Glu Gly Arg Gln Pro Leu Ser Arg Lys Leu Pro Ile Arg Ser Ser Arg

260 265 270 

lie Asn Pro Tyr Arg Met Leu He Leu Cys Arg Leu Ala lie Leu Gly 
5 275 280 285 

Leu Phe Phe His Tyr Arg He Leu His Pro Val Asn Asp Ala Tyr Gly 
290 295 300 

10 Leu Trp Leu Thr Ser Val He Cys Glu He Trp Phe Ala Val Ser Trp 
305 310 315 320 

He Leu Asp Gin Phe Pro Lys Trp Tyr Pro He Glu Arg Glu Thr Tyr 

325 330 335 

15 

Leu Asp Arg Leu Ser Leu Arg Tyr Glu Lys Glu Gly Lys Pro Ser Gly

340 345 350 

Leu Ala Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu 
20 355 360 365 

Pro Pro Leu He Thr Ala Asn Thr Val Leu Ser He Leu Ala Val Asp 
370 375 380 

25 Tyr Pro Val Asp Lys Val Ala Cys Tyr Val Ser Asn Asn Gly Ala Ala 
385 390 395 400 

Met Leu Thr Phe Glu Ala Leu Ser Asp Thr Ala Asp Phe Ala Thr Lys

405 410 415 

30 

Trp Val Pro Phe Cys Lys Lys Phe Asn He Glu Pro Arg Ala Pro Glu 

420 425 430 

Trp Tyr Phe Ser Gin Lys Met Asp Tyr Leu Lys Asn Lys Val His Pro 
35 435 440 445 

Ala Phe Val Arg Glu Arg Arg Ala Met Lys Arg Asp Tyr Glu Glu Phe 
450 455 460 
Lys Val Lys Ile Asn Ala Leu Val Ala Thr Ala Gln Lys Val Pro Glu
465 470 475 480 

Glu Arg Trp Thr Met Gin Asp Gly Thr Pro Trp Pro Gly Asn Asn Val 
5 485 490 495 

Arg Asp His Pro Gly Met lie Gin Val Phe Leu Gly His Ser Gly Val 

500 505 510 

10 Arg Asp Thr Asp Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg 

515 520 525 

Glu Lys Arg Pro Gly Phe Asp His His Lys Lys Ala Gly Ala Met Asn 
530 535 540 

15 

Ser Leu lie Arg Val Ser Ala Val Leu Ser Asn Ala Pro Tyr Leu Leu 
545 550 555 560 

Asn Val Asp Cys Asp His Tyr Ile Asn Asn Ser Lys Ala Ile Arg Glu

20 565 570 575 

Ser Met Cys Phe Met Met Asp Pro Gin Ser Gly Lys Lys Val Cys Tyr 

580 585 590 

25 Val Gin Phe Pro Gin Arg Phe Asp Gly lie Asp Arg His Asp Arg Tyr 

595 600 605 

Ser Asn Arg Asn Val Val Phe Phe Asp lie Asn Met Lys Gly Leu Asp 
610 615 620 

30 

Gly lie Gin Gly Pro lie Tyr Val Gly Thr Gly Cys Val Phe Arg Lys 
625 630 635 640 

Gin Ala Leu Tyr Gly Phe Asp Ala Pro Lys Lys Lys Lys Pro Pro Gly 
35 645 650 655 



Lys Thr Cys Asn Cys Trp Pro Lys Trp Cys Cys Leu Cys Cys Gly Leu 

660 665 670 
Arg Lys Lys Ser Lys Thr Lys Ala Thr Asp Lys Lys Thr Asn Thr Lys
675 680 685 

Glu Thr Ser Lys Gin lie His Ala Leu Glu Asn Val Asp Glu Gly Val 
5 690 695 700 

He Val Pro Val Ser Asn Val Glu Lys Arg Ser Glu Ala Thr Gin Leu 
705 710 715 720 

10 Lys Leu Glu Lys Lys Phe Gly Gin Ser Pro Val Phe Val Ala Ser Ala 

725 730 735 

Val Leu Gin Asn Gly Gly Val Pro Arg Asn Ala Ser Pro Ala Cys Leu 

740 745 750 

15 

Leu Arg Glu Ala He Gin Val lie Ser Cys Gly Tyr Gin Asp Lys Thr 
755 760 765 

Glu Trp Gly Lys Glu He Gly Trp He Tyr Gly Ser Val Thr Glu Asp 
20 770 775 780 

He Leu Thr Gly Phe Lys Met His Cys His Gly Trp Arg Ser Val Tyr 
785 790 795 800 

25 Cys Met Pro Lys Arg Ala Ala Phe Lys Gly Ser Ala Pro He Asn Leu 

805 810 815 

Ser Asp Arg Leu His Gin Val Leu Arg Trp Ala Leu Gly Ser Val Glu 

820 825 830 

30 

He Phe Leu Ser Arg His Cys Pro He Trp Tyr Gly Tyr Gly Gly Gly 
835 840 845 

Leu Lys Trp Leu Glu Arg Phe Ser Tyr He Asn Ser Val Val Tyr Pro 
35 850 855 860 

Trp Thr Ser Leu Pro Leu He Val Tyr Cys Ser Leu Pro Ala Val Cys 
865 870 875 880 
Leu Leu Thr Gly Lys Phe Ile Val Pro Glu Ile Ser Asn Tyr Ala Gly
885 890 895

lie Leu Phe Met Leu Met Phe lie Ser lie Ala Val Thr Gly lie Leu 

900 905 910 

Glu Met Gin Trp Gly Gly Val Gly lie Asp Asp Trp Trp Arg Asn Glu 
915 920 925 

Gin Phe Trp Val He Gly Gly Ala Ser Ser His Leu Phe Ala Leu Phe 
930 935 940 

Gin Gly Leu Leu Lys Val Leu Ala Gly Val Asn Thr Asn Phe Thr Val 
945 950 955 960 

Thr Ser Lys Ala Ala Asp Asp Gly Ala Phe Ser Glu Leu Tyr lie Phe 

965 970 975 

Lys Trp Thr Thr Leu Leu He Pro Pro Thr Thr Leu Leu He He Asn 

980 985 990 

He He Gly Val He Val Gly Val Ser Asp Ala He Ser Asn Gly Tyr 
995 1000 1005 

Asp Ser Trp Gly Pro Leu Phe Gly Arg Leu Phe Phe Ala Leu Trp Val 
1010 1015 1020 

He Val His Leu Tyr Pro Phe Leu Lys Gly Met Leu Gly Lys Gin Asp 
1025 1030 1035 1040 

Lys Met Pro Thr He He Val Val Trp Ser He Leu Leu Ala Ser He 

1045 1050 1055 

Leu Thr Leu Leu Trp Val Arg He Asn Pro Phe Val Ala Lys Gly Gly 

1060 1065 1070 



Pro Val Leu Glu He Cys Gly Leu Asn Cys Gly Asn 
1075 1080 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3614 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(B) STRAIN: Columbia

(vii) IMMEDIATE SOURCE:
(B) CLONE: Ath-B

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 217..3411

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAATTCGCGG CCGCGTCGAC TACGGCTGCG AGAAGACGAC AGAAGGGGAT CCCAAGATTC 60

TCCTCTTCGT CTTCCTTATA AACTATCTCT CTGTAGAGAA GAAAGCTTGG ATCCAGATTG 120

AGAGAGATTC AGAGAGCCAC ATCACCACAC TCCATCTTCA GATCTCATGA TTTGAACTAT 180

TCCGACGTTT CGGTGTTGGA AGCAACTAAG TGACAA ATG GAA TCC GAA GGA GAA 234

Met Glu Ser Glu Gly Glu
1 5

ACC GCG GGA AAG CCG ATG AAG AAC ATT GTT CCG CAG ACT TGC CAG ATC 282
Thr Ala Gly Lys Pro Met Lys Asn Ile Val Pro Gln Thr Cys Gln Ile
10 15 20

TGT AGT GAC AAT GTT GGC AAG ACT GTT GAT GGA GAT CGT TTT GTG GCT 330
Cys Ser Asp Asn Val Gly Lys Thr Val Asp Gly Asp Arg Phe Val Ala 
25 30 35 

5 TGT GAT ATT TGT TCA TTC CCA GTT TGT CGG CCT TGC TAC GAG TAT GAG 3 78 

Cys Asp lie Cys Ser Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr Glu 
40 45 50 

AGG AAA GAT GGG AAT CAA TCT TGT CCT GAG TGC AAA ACC AGA TAC AAG 426 
10 Arg Lys Asp Gly Asn Gin Ser Cys Pro Gin Cys Lys Thr Arg Tyr Lys 
55 60 65 70 

AGG CTC AAA GGT AGT CCT GCT ATT CCT GGT GAT AAA GAC GAG GAT GGC 4 74 

Arg Leu Lys Gly Ser Pro Ala lie Pro Gly Asp Lys Asp Glu Asp Gly 
15 75 80 85 

TTA GCT GAT GAA GGT ACT GTT GAG TTC AAC TAC CCT CAG AAG GAG AAA 522 

Leu Ala Asp Glu Gly Thr Val Glu Phe Asn Tyr Pro Gin Lys Glu Lys 

90 95 100 

20 

ATT TCA GAG CGG ATG CTT GGT TGG CAT CTT ACT CGT GGG AAG GGA GAG 570 

lie Ser Glu Arg Met Leu Gly Trp His Leu Thr Arg Gly Lys Gly Glu 
105 110 115 

25 GAA ATG GGG GAA CCC CAG TAT GAT AAA GAG GTC TCT CAC AAT CAT CTT 618 
Glu Met Gly Glu Pro Gin Tyr Asp Lys Glu Val Ser His Asn His Leu 
120 125 130 

CCT CGT CTC ACG AGC AGA CAA GAT ACT TCA GGA GAG TTT TCT GCT GCC 666 
30 Pro Arg Leu Thr Ser Arg Gin Asp Thr Ser Gly Glu Phe Ser Ala Ala 
135 140 145 150 

TCA CCT GAA CGC CTC TCT GTA TCT TCT ACT ATC GCT GGG GGA AAG CGC 714 
Ser Pro Glu Arg Leu Ser Val Ser Ser Thr lie Ala Gly Gly Lys Arg 
35 155 160 165 

CTT CCC TAT TCA TCA GAT GTC AAT CAA TCA CCA AAT AGA AGG ATT GTG 762 
Leu Pro Tyr Ser Ser Asp Val Asn Gin Ser Pro Asn Arg Arg lie Val 

170 175 180 

GAT CCT GTT GGA CTC GGG AAT GTA GCT TGG AAG GAG AGA GTT GAT GGC 810

Asp Pro Val Gly Leu Gly Asn Val Ala Trp Lys Glu Arg Val Asp Gly 
185 190 195 

5 TGG AAA ATG AAG CAA GAG AAG AAT ACT GGT CCT GTC AGC ACG CAG GCT 8 58 

Trp Lys Met Lys Gin Glu Lys Asn Thr Gly Pro Val Ser Thr Gin Ala 
200 205 210 

GCT TCT GAA AGA GGT GGA GTA GAT ATT GAT GCC AGC ACA GAT ATC CTA 906 

10 Ala Ser Glu Arg Gly Gly Val Asp lie Asp Ala Ser Thr Asp lie Leu 
215 220 225 230 



GCA GAT GAG GCT CTG CTG AAT GAC GAA GCG AGG CAG CTT CTG TCA AGG 954 
Ala Asp Glu Ala Leu Leu Asn Asp Glu Ala Arg Gin Leu Leu Ser Arg 
15 235 240 245 



AAA GTT TCA ATT CCT TCA TCA CGG ATC AAT CCT TAC AGA ATG GTT ATT 1002 

Lys Val Ser lie Pro Ser Ser Arg lie Asn Pro Tyr Arg Met Val lie 

250 255 260 

20 

ATG CTG CGG CTT GTT ATC CTT TGT CTC TTC TTG CAT TAC CGT ATA ACA 1050 

Met Leu Arg Leu Val lie Leu Cys Leu Phe Leu His Tyr Arg lie Thr 
265 270 275 



25 AAC CCA GTG CCA AAT GCC TTT GCT CTA TGG CTG GTC TCT GTG ATA TGT 109 8 

Asn Pro Val Pro Asn Ala Phe Ala Leu Trp Leu Val Ser Val He Cys 

280 285 290 

GAG ATC TGG TTT GCC TTA TCC TGG ATT TTG GAT CAG TTT CCC AAG TGG 114 6 

30 Glu He Trp Phe Ala Leu Ser Trp He Leu Asp Gin Phe Pro Lys Trp 

295 300 305 310 



TTT CCT GTG AAC CGT GAA ACC TAC CTC GAC AGG CTT GCT TTA AGA TAT 1194 

Phe Pro Val Asn Arg Glu Thr Tyr Leu Asp Arg Leu Ala Leu Arg Tyr 
35 315 320 325 

GAT CGT GAA GGT GAG CCA TCA CAG TTA GCT GCT GTT GAC ATT TTC GTG 124 2 

Asp Arg Glu Gly Glu Pro Ser Gin Leu Ala Ala Val Asp He Phe Val 

330 335 340 

AGT ACT GTT GAC CCC TTG AAG GAG CCA CCC CTT GTG ACA GCC AAC ACA 1290

Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu Val Thr Ala Asn Thr 
345 350 355 

5 GTG CTC TCT ATT CTG GCT GTT GAC TAC CCA GTT GAC AAG GTG TCC TGT 13 3 8 

Val Leu Ser lie Leu Ala Val Asp Tyr Pro Val Asp Lys Val Ser Cys 
360 365 370 

TAT GTT TCT GAT GAT GGT GCT GCT ATG TTA TCA TTT GAA TCA CTT GCA 13 86 

10 Tyr Val Ser Asp Asp Gly Ala Ala Met Leu Ser Phe Glu Ser Leu Ala 
375 380 385 390 

GAA ACA TCA GAG TTT GCT CGT AAA TGG GTA CCA TTT TGC AAG AAA TAT 14 34 

Glu Thr Ser Glu Phe Ala Arg Lys Trp Val Pro Phe Cys Lys Lys Tyr 
15 395 400 405 

AGC ATA GAG CCT CGT GCA CCA GAA TGG TAC TTT GCT GCG AAA ATA GAT 14 82 

Ser lie Glu Pro Arg Ala Pro Glu Trp Tyr Phe Ala Ala Lys lie Asp 

410 415 420 

20 

TAC TTG AAG GAT AAA GTT CAG ACA TCA TTT GTC AAA GAT CGT AGA GCT 153 0 

Tyr Leu Lys Asp Lys Val Gin Thr Ser Phe Val Lys Asp Arg Arg Ala 
425 430 435 

25 ATG AAG AGG GAA TAT GAG GAA TTT AAA ATC CGA ATC AAT GCA CTT GTT 157 8 

Met Lys Arg Glu Tyr Glu Glu Phe Lys lie Arg lie Asn Ala Leu Val 
440 445 450 

TCC AAA GCC CTA AAA TGT CCT GAA GAA GGG TGG GTT ATG CAA GAT GGC 1626 
30 Ser Lys Ala Leu Lys Cys Pro Glu Glu Gly Trp Val Met Gin Asp Gly 
455 460 465 470 

ACA CCG TGG CCT GGA AAT AAT ACA GGG GAC CAT CCA GGA ATG ATC CAG 16 74 

Thr Pro Trp Pro Gly Asn Asn Thr Gly Asp His Pro Gly Met lie Gin 
35 475 480 485 

GTC TTC TTA GGG CAA AAT GGT GGA CTT GAT GCA GAG GGC AAT GAG CTC 172 2 

Val Phe Leu Gly Gin Asn Gly Gly Leu Asp Ala Glu Gly Asn Glu Leu 

490 495 500 

CCG CGT TTG GTA TAT GTT TCT CGA GAA AAG CGA CCA GGA TTC GAG CAC 1770

Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Gln His
505 510 515 

5 CAC AAA AAG GCT GGT GCT ATG AAT GCA CTG GTG AGA GTT TCA GCA GTT 1818 

His Lys Lys Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Ala Val 
520 525 530 



CTT ACC AAT GGA CCT TTC ATC TTG AAT CTT GAT TGT GAT CAT TAC ATA 186 6 

10 Leu Thr Asn Gly Pro Phe He Leu Asn Leu Asp Cys Asp His Tyr He 
535 540 545 550 



AAT AAC AGC AAA GCC TTA AGA GAA GCA ATG TGC TTC CTG ATG GAC CCA 1914

Asn Asn Ser Lys Ala Leu Arg Glu Ala Met Cys Phe Leu Met Asp Pro
555 560 565

AAC CTC GGG AAG CAA GTT TGT TAT GTT CAG TTC CCA CAA AGA TTT GAT 1962

Asn Leu Gly Lys Gln Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp
570 575 580

GGT ATC GAT AAG AAC GAT AGA TAT GCT AAT CGT AAT ACC GTG TTC TTT 2010

Gly Ile Asp Lys Asn Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe
585 590 595

GAT ATT AAC TTG AGA GGT TTA GAT GGG ATT CAA GGA CCT GTA TAT GTC 2058

Asp Ile Asn Leu Arg Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val
600 605 610



GGA ACT GGA TGT GTT TTC AAC AGA ACA GCA TTA TAC GGT TAT GAA CCT 2106 
30 Gly Thr Gly Cys Val Phe Asn Arg Thr Ala Leu Tyr Gly Tyr Glu Pro 
615 620 625 630 



CCA ATA AAA GTA AAA CAC AAG AAG CCA AGT CTT TTA TCT AAG CTC TGT 2154 
Pro He Lys Val Lys His Lys Lys Pro Ser Leu Leu Ser Lys Leu Cys 
35 635 640 645 



GGT GGA TCA AGA AAG AAG AAT TCC AAA GCT AAG AAA GAG TCG GAC AAA 2202 

Gly Gly Ser Arg Lys Lys Asn Ser Lys Ala Lys Lys Glu Ser Asp Lys 

650 655 660

AAG AAA TCA GGC AGG CAT ACT GAC TCA ACT GTT CCT GTA TTC AAC CTC 2250

Lys Lys Ser Gly Arg His Thr Asp Ser Thr Val Pro Val Phe Asn Leu 
665 670 675 

5 GAT GAC ATA GAA GAG GGA GTT GAA GGT GCT GGT TTT GAT GAT GAA AAG 2298 
Asp Asp lie Glu Glu Gly Val Glu Gly Ala Gly Phe Asp Asp Glu Lys 
680 685 690 

GCG CTC TTA ATG TCG CAA ATG AGC CTG GAG AAG CGA TTT GGA CAG TCT 234 6 

10 Ala Leu Leu Met Ser Gin Met Ser Leu Glu Lys Arg Phe Gly Gin Ser 
695 700 705 710 

GCT GTT TTT GTT GCT TCT ACC CTA ATG GAA AAT GGT GGT GTT CCT CCT 23 94 

Ala Val Phe Val Ala Ser Thr Leu Met Glu Asn Gly Gly Val Pro Pro 
15 715 720 725 

TCA GCA ACT CCA GAA AAC TTT CTC AAA GAG GCT ATC CAT GTC ATT AGT 244 2 

Ser Ala Thr Pro Glu Asn Phe Leu Lys Glu Ala He His Val He Ser 

730 735 740 

20 

TGT GGT TAT GAG GAT AAG TCA GAT TGG GGA ATG GAG ATT GGA TGG ATC 2 4 90 

Cys Gly Tyr Glu Asp Lys Ser Asp Trp Gly Met Glu Ile Gly Trp Ile

745 750 755 

25 TAT GGT TCT GTG ACA GAA GAT ATT CTG ACT GGG TTC AAA ATG CAT GCC 253 8 

Tyr Gly Ser Val Thr Glu Asp He Leu Thr Gly Phe Lys Met His Ala 
760 765 770 

CGT GGA TGG CGA TCC ATT TAC TGC ATG CCT AAG CTT CCA GCT TTC AAG 2586 
30 Arg Gly Trp Arg Ser He Tyr Cys Met Pro Lys Leu Pro Ala Phe Lys 
775 780 785 790 

GGT TCT GCT CCT ATC AAT CTT TCA GAT CGT CTG AAC CAA GTG CTG AGG 2 63 4 

Gly Ser Ala Pro He Asn Leu Ser Asp Arg Leu Asn Gin Val Leu Arg 
35 795 800 805 

TGG GCT TTA GGT TCA GTT GAG ATT CTC TTC AGT CGG CAT TGT CCT ATA 2682 

Trp Ala Leu Gly Ser Val Glu He Leu Phe Ser Arg His Cys Pro He 

810 815 820 

TGG TAT GGT TAC AAT GGG AGG CTA AAA TTT CTT GAG AGG TTT GCG TAT 2730

Trp Tyr Gly Tyr Asn Gly Arg Leu Lys Phe Leu Glu Arg Phe Ala Tyr 
825 830 835 

5 GTG AAC ACC ACC ATC TAC CCT ATC ACC TCC ATT CCT CTT CTC ATG TAT 27 7 8 

Val Asn Thr Thr lie Tyr Pro lie Thr Ser lie Pro Leu Leu Met Tyr 
840 845 850 

TGT ACA TTG CTA GCC GTT TGT CTC TTC ACC AAC CAG TTT ATT ATT CCT 28 26 

10 Cys Thr Leu Leu Ala Val Cys Leu Phe Thr Asn Gin Phe lie lie Pro 
855 860 865 870 

CAG ATT AGT AAC ATT GCA AGT ATA TGG TTT CTG TCT CTC TTT CTC TCC 2 8 74 

Gin He Ser Asn He Ala Ser He Trp Phe Leu Ser Leu Phe Leu Ser 
15 875 880 885 

ATT TTC GCC ACG GGT ATA CTA GAA ATG AGG TGG AGT GGC GTA GGC ATA 2 922 

He Phe Ala Thr Gly He Leu Glu Met Arg Trp Ser Gly Val Gly He 

890 895 900 

20 

GAC GAA TGG TGG AGA AAC GAG CAG TTT TGG GTC ATT GGT GGA GTA TCC 2 970 

Asp Glu Trp Trp Arg Asn Glu Gin Phe Trp Val He Gly Gly Val Ser 
905 910 915 

25 GCT CAT TTA TTC GCT GTG TTT CAA GGT ATC CTC AAA GTC CTT GCC GGT 3 018 

Ala His Leu Phe Ala Val Phe Gin Gly He Leu Lys Val Leu Ala Gly 
920 925 930 

ATT GAC ACA AAC TTC ACA GTT ACC TCA AAA GCT TCA GAT GAA GAC GGA 3 066 

30 He Asp Thr Asn Phe Thr Val Thr Ser Lys Ala Ser Asp Glu Asp Gly 
935 940 945 950 

GAC TTT GCT GAG CTC TAC TTG TTC AAA TGG ACA ACA CTT CTG ATT CCG 3114 
Asp Phe Ala Glu Leu Tyr Leu Phe Lys Trp Thr Thr Leu Leu He Pro 
35 955 960 965 

CCA ACG ACG CTG CTC ATT GTA AAC TTA GTG GGA GTT GTT GCA GGA GTC 3162 

Pro Thr Thr Leu Leu He Val Asn Leu Val Gly Val Val Ala Gly Val 

970 975 980 

TCT TAT GCT ATC AAC AGT GGA TAC CAA TCA TGG GGA CCA CTC TTT GGT 3210

Ser Tyr Ala He Asn Ser Gly Tyr Gin Ser Trp Gly Pro Leu Phe Gly 
985 990 995 

5 AAG TTG TTC TTT GCC TTC TGG GTG ATT GTT CAC TTG TAC CCT TTC CTC 3 2 58 

Lys Leu Phe Phe Ala Phe Trp Val He Val His Leu Tyr Pro Phe Leu 
1000 1005 1010 

AAG GGT TTG ATG GGT CGA CAG AAC CGG ACT CCT ACC ATT GTT GTG GTC 3 306 

10 Lys Gly Leu Met Gly Arg Gin Asn Arg Thr Pro Thr He Val Val Val 
1015 1020 1025 1030 


TGG TCT GTT CTC TTG GCT TCT ATC TTC TCG TTG TTG TGG GTT AGG ATT 3 3 54 

Trp Ser Val Leu Leu Ala Ser He Phe Ser Leu Leu Trp Val Arg He 
15 1035 1040 1045 

GAT CCC TTC ACT AGC CGA GTC ACT GGC CCG GAC ATT CTG GAA TGT GGA 3 4 02 

Asp Pro Phe Thr Ser Arg Val Thr Gly Pro Asp He Leu Glu Cys Gly 

1050 1055 1060 

20 

ATC AAC TGT TGAGAAGCGA GCAAATATTT ACCTGTTTTG AGGGTTAAAA 3451

Ile Asn Cys
1065

AAAAGACAGA ATTTAAATTA TTTTTCATTG TTTTATTTGT TCACTTTTTT ACTTTTGTTG 3511

TGTGTATCTG TCTGTTCGTT CTTCTGTCTT GGTGTCATAA ATTTATGTGT AGAATATATC 3571

TTACTCTAGT TACTTTGGAA AGTTATAATT AAAGTGAAAG CCA 3614

30 



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1065 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Glu Ser Glu Gly Glu Thr Ala Gly Lys Pro Met Lys Asn Ile Val
1 5 10 15

Pro Gin Thr Cys Gin He Cys Ser Asp Asn Val Gly Lys Thr Val Asp 

20 25 30 

10 

Gly Asp Arg Phe Val Ala Cys Asp He Cys Ser Phe Pro Val Cys Arg 
35 40 45 

Pro Cys Tyr Glu Tyr Glu Arg Lys Asp Gly Asn Gin Ser Cys Pro Gin 
15 50 55 60 

Cys Lys Thr Arg Tyr Lys Arg Leu Lys Gly Ser Pro Ala He Pro Gly 
65 70 75 80 

Asp Lys Asp Glu Asp Gly Leu Ala Asp Glu Gly Thr Val Glu Phe Asn

85 90 95

Tyr Pro Gin Lys Glu Lys He Ser Glu Arg Met Leu Gly Trp His Leu 

100 105 110 

25 

Thr Arg Gly Lys Gly Glu Glu Met Gly Glu Pro Gin Tyr Asp Lys Glu 
115 120 125 

Val Ser His Asn His Leu Pro Arg Leu Thr Ser Arg Gin Asp Thr Ser 
30 130 135 140 

Gly Glu Phe Ser Ala Ala Ser Pro Glu Arg Leu Ser Val Ser Ser Thr 
145 150 155 160 

35 He Ala Gly Gly Lys Arg Leu Pro Tyr Ser Ser Asp Val Asn Gin Ser 

165 170 175 

Pro Asn Arg Arg He Val Asp Pro Val Gly Leu Gly Asn Val Ala Trp 

180 185 190 

Lys Glu Arg Val Asp Gly Trp Lys Met Lys Gln Glu Lys Asn Thr Gly
195 200 205 

Pro Val Ser Thr Gin Ala Ala Ser Glu Arg Gly Gly Val Asp lie Asp 
5 210 215 220 



Ala Ser Thr Asp lie Leu Ala Asp Glu Ala Leu Leu Asn Asp Glu Ala 

225 230 235 240 

10 Arg Gin Leu Leu Ser Arg Lys Val Ser lie Pro Ser Ser Arg lie Asn 

245 250 255 



Pro Tyr Arg Met Val Ile Met Leu Arg Leu Val Ile Leu Cys Leu Phe
260 265 270

Leu His Tyr Arg Ile Thr Asn Pro Val Pro Asn Ala Phe Ala Leu Trp
275 280 285

Leu Val Ser Val Ile Cys Glu Ile Trp Phe Ala Leu Ser Trp Ile Leu
290 295 300

Asp Gln Phe Pro Lys Trp Phe Pro Val Asn Arg Glu Thr Tyr Leu Asp
305 310 315 320

Arg Leu Ala Leu Arg Tyr Asp Arg Glu Gly Glu Pro Ser Gln Leu Ala
325 330 335

Ala Val Asp Ile Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro
340 345 350

Leu Val Thr Ala Asn Thr Val Leu Ser Ile Leu Ala Val Asp Tyr Pro
355 360 365



Val Asp Lys Val Ser Cys Tyr Val Ser Asp Asp Gly Ala Ala Met Leu 

35 370 375 380 

Ser Phe Glu Ser Leu Ala Glu Thr Ser Glu Phe Ala Arg Lys Trp Val 

385 390 395 400 
Pro Phe Cys Lys Lys Tyr Ser Ile Glu Pro Arg Ala Pro Glu Trp Tyr

405 410 415 

Phe Ala Ala Lys lie Asp Tyr Leu Lys Asp Lys Val Gin Thr Ser Phe 
5 420 425 430 

Val Lys Asp Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys lie 
435 440 445 

10 Arg lie Asn Ala Leu Val Ser Lys Ala Leu Lys Cys Pro Glu Glu Gly 
450 455 460 

Trp Val Met Gin Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Gly Asp 
465 470 475 480 

15 

His Pro Gly Met lie Gin Val Phe Leu Gly Gin Asn Gly Gly Leu Asp 

485 490 495 

Ala Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys 
20 500 505 510 

Arg Pro Gly Phe Gin His His Lys Lys Ala Gly Ala Met Asn Ala Leu 
515 520 525 

25 Val Arg Val Ser Ala Val Leu Thr Asn Gly Pro Phe He Leu Asn Leu 
530 535 540 

Asp Cys Asp His Tyr lie Asn Asn Ser Lys Ala Leu Arg Glu Ala Met 
545 550 555 560 

30 

Cys Phe Leu Met Asp Pro Asn Leu Gly Lys Gin Val Cys Tyr Val Gin 

565 570 575 

Phe Pro Gin Arg Phe Asp Gly He Asp Lys Asn Asp Arg Tyr Ala Asn 
35 580 585 590 

Arg Asn Thr Val Phe Phe Asp He Asn Leu Arg Gly Leu Asp Gly He 
595 600 605 
Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Asn Arg Thr Ala
610 615 620 

Leu Tyr Gly Tyr Glu Pro Pro lie Lys Val Lys His Lys Lys Pro Ser 
5 625 630 635 640 

Leu Leu Ser Lys Leu Cys Gly Gly Ser Arg Lys Lys Asn Ser Lys Ala 

645 650 655 

10 Lys Lys Glu Ser Asp Lys Lys Lys Ser Gly Arg His Thr Asp Ser Thr 

660 665 670 

Val Pro Val Phe Asn Leu Asp Asp lie Glu Glu Gly Val Glu Gly Ala 
675 680 685 

15 

Gly Phe Asp Asp Glu Lys Ala Leu Leu Met Ser Gin Met Ser Leu Glu 
690 695 700 

Lys Arg Phe Gly Gin Ser Ala Val Phe Val Ala Ser Thr Leu Met Glu 
20 705 710 715 720 

Asn Gly Gly Val Pro Pro Ser Ala Thr Pro Glu Asn Phe Leu Lys Glu 

725 730 735 

25 Ala He His Val He Ser Cys Gly Tyr Glu Asp Lys Ser Asp Trp Gly 

740 745 750 

Met Glu He Gly Trp He Tyr Gly Ser Val Thr Glu Asp He Leu Thr 
755 760 765 

30 

Gly Phe Lys Met His Ala Arg Gly Trp Arg Ser He Tyr Cys Met Pro 
770 775 780 

Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro He Asn Leu Ser Asp Arg 
35 785 790 795 800 



Leu Asn Gin Val Leu Arg Trp Ala Leu Gly Ser Val Glu He Leu Phe 

805 810 815 
Ser Arg His Cys Pro Ile Trp Tyr Gly Tyr Asn Gly Arg Leu Lys Phe

820 825 830 

Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr lie Tyr Pro lie Thr Ser 
5 835 840 845 

He Pro Leu Leu Met Tyr Cys Thr Leu Leu Ala Val Cys Leu Phe Thr 
850 855 860 

10 Asn Gin Phe He He Pro Gin He Ser Asn He Ala Ser He Trp Phe 
865 870 875 880 

Leu Ser Leu Phe Leu Ser He Phe Ala Thr Gly He Leu Glu Met Arg 

885 890 895 

15 

Trp Ser Gly Val Gly He Asp Glu Trp Trp Arg Asn Glu Gin Phe Trp 

900 905 910 

Val He Gly Gly Val Ser Ala His Leu Phe Ala Val Phe Gin Gly He 
20 915 920 925 

Leu Lys Val Leu Ala Gly He Asp Thr Asn Phe Thr Val Thr Ser Lys 
930 935 940 

25 Ala Ser Asp Glu Asp Gly Asp Phe Ala Glu Leu Tyr Leu Phe Lys Trp 
945 950 955 960 

Thr Thr Leu Leu He Pro Pro Thr Thr Leu Leu He Val Asn Leu Val 

965 970 975 

30 

Gly Val Val Ala Gly Val Ser Tyr Ala He Asn Ser Gly Tyr Gin Ser 

980 985 990 

Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala Phe Trp Val He Val 
35 995 1000 1005 

His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gin Asn Arg Thr 
1010 1015 1020 
Pro Thr Ile Val Val Val Trp Ser Val Leu Leu Ala Ser Ile Phe Ser
1025 1030 1035 1040

Leu Leu Trp Val Arg Ile Asp Pro Phe Thr Ser Arg Val Thr Gly Pro
1045 1050 1055

Asp He Leu Glu Cys Gly He Asn Cys 

1060 1065 



10 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3673 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(B) STRAIN: Columbia

(C) INDIVIDUAL ISOLATE: rsw1 mutant

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 71..3313

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



GAATCGGCTA CGAATTTCCC AATTTTGAAT TTTGTGAATC TCTCTCTTTC TCTGTGTGTC 60
GGTGGCTGCG ATG GAG GCC AGT GCC GGC TTG GTT GCT GGA TCC TAC CGG 109

Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg
1 5 10

AGA AAC GAG CTC GTT CGG ATC CGA CAT GAA TCT GAT GGC GGG ACC AAA 157
Arg Asn Glu Leu Val Arg Ile Arg His Glu Ser Asp Gly Gly Thr Lys
15 20 25

CCT TTG AAG AAT ATG AAT GGC CAG ATA TGT CAG ATC TGT GGT GAT GAT 205
Pro Leu Lys Asn Met Asn Gly Gln Ile Cys Gln Ile Cys Gly Asp Asp
30 35 40 45

GTT GGA CTC GCT GAA ACT GGA GAT GTC TTT GTC GCG TGT AAT GAA TGT 253
Val Gly Leu Ala Glu Thr Gly Asp Val Phe Val Ala Cys Asn Glu Cys
50 55 60

GCC TTC CCT GTG TGT CGG CCT TGC TAT GAG TAC GAG AGG AAA GAT GGA 301

Ala Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Lys Asp Gly
65 70 75

ACT CAG TGT TGC CCT CAA TGC AAG ACT AGA TTC AGA CGA CAC AGG GGG 349

Thr Gln Cys Cys Pro Gln Cys Lys Thr Arg Phe Arg Arg His Arg Gly
80 85 90

AGT CCT CGT GTT GAA GGA GAT GAA GAT GAG GAT GAT GTT GAT GAT ATC 397

Ser Pro Arg Val Glu Gly Asp Glu Asp Glu Asp Asp Val Asp Asp Ile
95 100 105

GAG AAT GAG TTC AAT TAC GCC CAG GGA GCT AAC AAG GCG AGA CAC CAA 445

Glu Asn Glu Phe Asn Tyr Ala Gln Gly Ala Asn Lys Ala Arg His Gln
110 115 120 125

CGC CAT GGC GAA GAG TTT TCT TCT TCC TCT AGA CAT GAA TCT CAA CCA 493
Arg His Gly Glu Glu Phe Ser Ser Ser Ser Arg His Glu Ser Gln Pro
130 135 140

ATT CCT CTT CTC ACC CAT GGC CAT ACG GTT TCT GGA GAG ATT CGC ACG 541

Ile Pro Leu Leu Thr His Gly His Thr Val Ser Gly Glu Ile Arg Thr
145 150 155

PCT/AU97/00402

CCT GAT ACA CAA TCT GTG CGA ACT ACA TCA GGT CCT TTG GGT CCT TCT 589

Pro Asp Thr Gln Ser Val Arg Thr Thr Ser Gly Pro Leu Gly Pro Ser
160 165 170 

5 GAC AGG AAT GCT ATT TCA TCT CCA TAT ATT GAT CCA CGG CAA CCT GTC 6 37 

Asp Arg Asn Ala lie Ser Ser Pro Tyr lie Asp Pro Arg Gin Pro Val 
175 180 185 

CCT GTA AGA ATC GTG GAC CCG TCA AAA GAC TTG AAC TCT TAT GGG CTT 6 85 

10 Pro Val Arg lie Val Asp Pro Ser Lys Asp Leu Asn Ser Tyr Gly Leu 
190 195 200 205 

GGT AAT GTT GAC TGG AAA GAA AGA GTT GAA GGC TGG AAG CTG AAG CAG 73 3 

Gly Asn Val Asp Trp Lys Glu Arg Val Glu Gly Trp Lys Leu Lys Gin 
15 210 215 220 

GAG AAA AAT ATG TTA CAG ATG ACT GGT AAA TAC CAT GAA GGG AAA GGA 781 
Glu Lys Asn Met Leu Gin Met Thr Gly Lys Tyr His Glu Gly Lys Gly 

225 230 235 

20 

GGA GAA ATT GAA GGG ACT GGT TCC AAT GGC GAA GAA CTC CAA ATG GCT 82 9 

Gly Glu He Glu Gly Thr Gly Ser Asn Gly Glu Glu Leu Gin Met Ala 
240 245 250 

25 GAT GAT ACA CGT CTT CCT ATG AGT CGT GTG GTG CCT ATC CCA TCT TCT 877 
Asp Asp Thr Arg Leu Pro Met Ser Arg Val Val Pro He Pro Ser Ser 
255 260 265 

CGC CTA ACC CCT TAT CGG GTT GTG ATT ATT CTC CGG CTT ATC ATC TTG 925 
30 Arg Leu Thr Pro Tyr Arg Val Val He lie Leu Arg Leu He He Leu 
270 275 280 285 

TGT TTC TTC TTG CAA TAT CGT ACA ACT CAC CCT GTG AAA AAT GCA TAT 973 
Cys Phe Phe Leu Gin Tyr Arg Thr Thr His Pro Val Lys Asn Ala Tyr 
35 290 295 300 

CCT TTG TGG TTG ACC TCG GTT ATC TGT GAG ATC TGG TTT GCA TTT TCT 1021 

Pro Leu Trp Leu Thr Ser Val He Cys Glu He Trp Phe Ala Phe Ser 

305 310 315 

TGG CTT CTT GAT CAG TTT CCC AAA TGG TAC CCC ATT AAC AGG GAG ACT 1069

Trp Leu Leu Asp Gln Phe Pro Lys Trp Tyr Pro Ile Asn Arg Glu Thr
320 325 330

TAT CTT GAC CGT CTC GCT ATA AGA TAT GAT CGA GAC GGT GAA CCA TCA 1117

Tyr Leu Asp Arg Leu Ala Ile Arg Tyr Asp Arg Asp Gly Glu Pro Ser
335 340 345 

CAG CTC GTT CCT GTT GAT GTG TTT GTT AGT ACA GTG GAC CCA TTG AAA 116 5 

10 Gin Leu Val Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys 
350 355 360 365 

GAG CCT CCC CTT GTT ACA GCA AAC ACA GTT CTC TCG ATT CTT TCT GTG 1213 

Glu Pro Pro Leu Val Thr Ala Asn Thr Val Leu Ser lie Leu Ser Val 
15 370 375 380 

GAC TAC CCG GTA GAT AAA GTA GCC TGT TAT GTT TCA GAT GAT GGT TCA 1261 

Asp Tyr Pro Val Asp Lys Val Ala Cys Tyr Val Ser Asp Asp Gly Ser 

385 390 395 

20 

GCT ATG CTT ACC TTT GAA TCC CTT TCT GAA ACC GCT GAG TTT GCA AAG 13 0 9 

Ala Met Leu Thr Phe Glu Ser Leu Ser Glu Thr Ala Glu Phe Ala Lys 
400 405 410 

25 AAA TGG GTA CCA TTT TGC AAG AAA TTC AAC ATT GAA CCT AGG GCC CCT 13 57 

Lys Trp Val Pro Phe Cys Lys Lys Phe Asn lie Glu Pro Arg Ala Pro 
415 420 425 

GAA TTC TAT TTT GCC CAG AAG ATA GAT TAC TTG AAG GAC AAG ATC CAA 14 05 

30 Glu Phe Tyr Phe Ala Gin Lys lie Asp Tyr Leu Lys Asp Lys lie Gin 
430 435 440 445 

CCG TCT TTT GTT AAA GAG CGA CGA GCT ATG AAG AGA GAG TAT GAA GAG 14 5 3 

Pro Ser Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu 
35 450 455 460 

TTT AAA GTG AGG ATA AAT GCT CTT GTT GCC AAA GCA CAG AAA ATC CCT 1501 

Phe Lys Val Arg lie Asn Ala Leu Val Ala Lys Ala Gin Lys lie Pro 

465 470 475 

GAA GAA GGC TGG ACA ATG CAG GAT GGT ACT CCC TGG CCT GGT AAC AAC 1549

Glu Glu Gly Trp Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn
480 485 490

ACT AGA GAT CAT CCT GGA ATG ATA CAG GTG TTC TTA GGC CAT AGT GGG 1597

Thr Arg Asp His Pro Gly Met Ile Gln Val Phe Leu Gly His Ser Gly
495 500 505

GGT CTG GAT ACC GAT GGA AAT GAG CTG CCT AGA CTC ATC TAT GTT TCT 1645

Gly Leu Asp Thr Asp Gly Asn Glu Leu Pro Arg Leu Ile Tyr Val Ser
510 515 520 525

CGT GAA AAG CGG CCT GGA TTT CAA CAC CAC AAA AAG GCT GGA GCT ATG 1693

Arg Glu Lys Arg Pro Gly Phe Gln His His Lys Lys Ala Gly Ala Met
530 535 540

AAT GCA TTG ATC CGT GTA TCT GTT GTT CTT ACC AAT GGA GCA TAT CTT 1741

Asn Ala Leu Ile Arg Val Ser Val Val Leu Thr Asn Gly Ala Tyr Leu
545 550 555

TTG AAC GTG GAT TGT GAT CAT TAC TTT AAT AAC AGT AAG GCT ATT AAA 1789

Leu Asn Val Asp Cys Asp His Tyr Phe Asn Asn Ser Lys Ala Ile Lys
560 565 570

GAA GCT ATG TGT TTC ATG ATG GAC CCG GCT ATT GGA AAG AAG TGC TGC 1837

Glu Ala Met Cys Phe Met Met Asp Pro Ala Ile Gly Lys Lys Cys Cys
575 580 585

TAT GTC GAG TTC CCT CAA CGT TTT GAC GGT ATT GAT TTG CAC GAT CGA 1885

Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Leu His Asp Arg
590 595 600 605

TAT GCC AAC AGG AAT ATA GTC TTT TTC GAT ATT AAC ATG AAG GGG TTG 1933

Tyr Ala Asn Arg Asn Ile Val Phe Phe Asp Ile Asn Met Lys Gly Leu
610 615 620

GAT GGT ATC CAG GGT CCA GTA TAT GTG GGT ACT GGT TGT TGT TTT AAT 1981

Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Cys Phe Asn
625 630 635

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AGG CAG GCT CTA TAT GGG TAT GAT CCT GTT TTG ACG GAA GAA GAT TTA 2 02 9 

Arg Gin Ala Leu Tyr Gly Tyr Asp Pro Val Leu Thr Glu Glu Asp Leu 
640 645 650 

5 GAA CCA AAT ATT ATT GTC AAG AGC TGT TGC GGG TCA AGG AAG AAA GGT 207 7 

Glu Pro Asn lie lie Val Lys Ser Cys Cys Gly Ser Arg Lys Lys Gly 
655 660 665 

AAA AGT AGC AAG AAG TAT AAC TAC GAA AAG AGG AGA GGC ATC AAC AGA 2125 
10 Lys Ser Ser Lys Lys Tyr Asn Tyr Glu Lys Arg Arg Gly lie Asn Arg 
670 675 680 685 

AGT GAC TCC AAT GCT CCA CTT TTC AAT ATG GAG GAC ATC GAT GAG GGT 2173 
Ser Asp Ser Asn Ala Pro Leu Phe Asn Met Glu Asp He Asp Glu Gly 
15 690 695 700 

TTT GAA GGT TAT GAT GAT GAG AGG TCT ATT CTA ATG TCC CAG AGG AGT 2 221 

Phe Glu Gly Tyr Asp Asp Glu Arg Ser lie Leu Met Ser Gin Arg Ser 

705 710 715 

20 

GTA GAG AAG CGT TTT GGT CAG TCG CCG GTA TTT ATT GCG GCA ACC TTC 226 9 

Val Glu Lys Arg Phe Gly Gin Ser Pro Val Phe He Ala Ala Thr Phe 

720 725 730 

25 ATG GAA CAA GGC GGC ATT CCA CCA ACA ACC AAT CCC GCT ACT CTT CTG 2317 
Met Glu Gin Gly Gly He Pro Pro Thr Thr Asn Pro Ala Thr Leu Leu 
735 740 745 

AAG GAG GCT ATT CAT GTT ATA AGC TGT GGT TAC GAA GAC AAG ACT GAA 23 6 5 

30 Lys Glu Ala He His Val He Ser Cys Gly Tyr Glu Asp Lys Thr Glu 
750 755 760 765

TGG GGC AAA GAG ATT GGT TGG ATC TAT GGT TCC GTG ACG GAA GAT ATT 2413 
Trp Gly Lys Glu He Gly Trp He Tyr Gly Ser Val Thr Glu Asp He 
770 775 780

CTT ACT GGG TTC AAG ATG CAT GCC CGG GGT TGG ATA TCG ATC TAC TGC 2461 

Leu Thr Gly Phe Lys Met His Ala Arg Gly Trp He Ser He Tyr Cys 

785 790 795 

AAT CCT CCA CGC CCT GCG TTC AAG GGA TCT GCA CCA ATC AAT CTT TCT 2509

Asn Pro Pro Arg Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser
800 805 810

5 GAT CGT TTG AAC CAA GTT CTT CGA TGG GCT TTG GGA TCT ATC GAG ATT 25 57 

Asp Arg Leu Asn Gin Val Leu Arg Trp Ala Leu Gly Ser lie Glu lie 
815 820 825 



CTT CTT AGC AGA CAT TGT CCT ATC TGG TAT GGT TAC CAT GGA AGG TTG 26 05 

10 Leu Leu Ser Arg His Cys Pro He Trp Tyr Gly Tyr His Gly Arg Leu 
830 835 840 845 

AGA CTT TTG GAG AGG ATC GCT TAT ATC AAC ACC ATC GTC TAT CCT ATT 26 53 

Arg Leu Leu Glu Arg He Ala Tyr He Asn Thr He Val Tyr Pro He 
15 850 855 860 



ACA TCC ATC CCT CTT ATT GCG TAT TGT ATT CTT CCC GCT TTT TGT CTC 2 701 

Thr Ser He Pro Leu He Ala Tyr Cys He Leu Pro Ala Phe Cys Leu 

865 870 875 

20 

ATC ACC GAC AGA TTC ATC ATA CCC GAG ATA AGC AAC TAC GCG AGT ATT 274 9 

He Thr Asp Arg Phe He lie Pro Glu He Ser Asn Tyr Ala Ser lie 
880 885 890

25 TGG TTC ATT CTA CTC TTC ATC TCA ATT GCT GTG ACT GGA ATC CTG GAG 2 7 97 

Trp Phe He Leu Leu Phe He Ser He Ala Val Thr Gly He Leu Glu 
895 900 905 

CTG AGA TGG AGC GGT GTG AGC ATT GAG GAT TGG TGG AGG AAC GAG CAG 284 5 

30 Leu Arg Trp Ser Gly Val Ser He Glu Asp Trp Trp Arg Asn Glu Gin 
910 915 920 925 

TTC TGG GTC ATT GGT GGC ACA TCC GCC CAT CTT TTT GCT GTC TTC CAA 28 93 

Phe Trp Val He Gly Gly Thr Ser Ala His Leu Phe Ala Val Phe Gin 
35 930 935 940 

GGT CTA CTT AAG GTT CTT GCT GGT ATC GAC ACC AAC TTC ACC GTT ACA 2 941 

Gly Leu Leu Lys Val Leu Ala Gly He Asp Thr Asn Phe Thr Val Thr 

945 950 955 

TCT AAA GCC ACA GAC GAA GAT GGG GAT TTT GCA GAA CTC TAC ATC TTC 2989

Ser Lys Ala Thr Asp Glu Asp Gly Asp Phe Ala Glu Leu Tyr Ile Phe
960 965 970

5 AAA TGG ACA GCT CTT CTC ATT CCA CCA ACC ACC GTC CTA CTT GTG AAC 3 03 7 

Lys Trp Thr Ala Leu Leu He Pro Pro Thr Thr Val Leu Leu Val Asn 
975 980 985 

CTC ATA GGC ATT GTG GCT GGT GTC TCT TAT GCT GTA AAC AGT GGC TAC 3 085 

10 Leu He Gly He Val Ala Gly Val Ser Tyr Ala Val Asn Ser Gly Tyr 
990 995 1000 1005 

CAG TCG TGG GGT CCG CTT TTC GGG AAG CTC TTC TTC GCC TTA TGG GTT 3133 
Gin Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala Leu Trp Val 
15 1010 1015 1020 

ATT GCC CAT CTC TAC CCT TTC TTG AAA GGT CTG TTG GGA AGA CAA AAC 3181 
He Ala His Leu Tyr Pro Phe Leu Lys Gly Leu Leu Gly Arg Gin Asn 

1025 1030 1035 

20 

CGA ACA CCA ACC ATC GTC ATT GTC TGG TCT GTT CTT CTC GCC TCC ATC 3 22 9 

Arg Thr Pro Thr He Val He Val Trp Ser Val Leu Leu Ala Ser He 
1040 1045 1050 

25 TTC TCG TTG CTT TGG GTC AGG ATC AAT CCC TTT GTG GAC GCC AAT CCC 3 2 77 

Phe Ser Leu Leu Trp Val Arg He Asn Pro Phe Val Asp Ala Asn Pro 
1055 1060 1065

AAT GCC AAC AAC TTC AAT GGC AAA GGA GGT GTC TTT TAGACCCTAT 33 23 

30 Asn Ala Asn Asn Phe Asn Gly Lys Gly Gly Val Phe 

1070 1075 1080 

TTATATACTT GTGTGTGCAT ATATCAAAAA CGCGCAATGG GAATTCCAAA TCATCTAAAC 3383

CCATCAAACC CCAGTGAACC GGGCAGTTAA GGTGATTCCA TGTCCAAGAT TAGCTTTCTC 3443

CGAGTAGCCA GAGAAGGTGA AATTGTTCGT AACACTATTG TAATGATTTT CCAGTGGGGA 3503

AGAAGATGTG GACCCAAATG ATACATAGTC TACAAAAAGA ATTTGTTATT CTTTCTTATA 3563

TTTATTTTAT TTAAAGCTTG TTAGACTCAC ACTTATGTAA TGTTGGAACT TGTTGTCCTA 3623

AAAAGGGATT GGAGTTTTCT TTTTATCTAA GAATCTGAAG TTTATATGCT 3673



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1081 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg Arg Asn Glu 
1 5 10 15 

Leu Val Arg lie Arg His Glu Ser Asp Gly Gly Thr Lys Pro Leu Lys 

20 25 30 

Asn Met Asn Gly Gin lie Cys Gin He Cys Gly Asp Asp Val Gly Leu 
35 40 45 

Ala Glu Thr Gly Asp Val Phe Val Ala Cys Asn Glu Cys Ala Phe Pro 
50 55 60

Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Lys Asp Gly Thr Gin Cys 
65 70 75 80 

Cys Pro Gin Cys Lys Thr Arg Phe Arg Arg His Arg Gly Ser Pro Arg 

85 90 95 

Val Glu Gly Asp Glu Asp Glu Asp Asp Val Asp Asp He Glu Asn Glu 

100 105 110 






Phe Asn Tyr Ala Gin Gly Ala Asn Lys Ala Arg His Gin Arg His Gly 
115 120 125 

Glu Glu Phe Ser Ser Ser Ser Arg His Glu Ser Gin Pro lie Pro Leu 
5 130 135 140 

Leu Thr His Gly His Thr Val Ser Gly Glu Ile Arg Thr Pro Asp Thr
145 150 155 160 

10 Gin Ser Val Arg Thr Thr Ser Gly Pro Leu Gly Pro Ser Asp Arg Asn 

165 170 175 

Ala lie Ser Ser Pro Tyr He Asp Pro Arg Gin Pro Val Pro Val Arg 

180 185 190 

15 

He Val Asp Pro Ser Lys Asp Leu Asn Ser Tyr Gly Leu Gly Asn Val 
195 200 205 

Asp Trp Lys Glu Arg Val Glu Gly Trp Lys Leu Lys Gin Glu Lys Asn 
20 210 215 220 

Met Leu Gin Met Thr Gly Lys Tyr His Glu Gly Lys Gly Gly Glu He 
225 230 235 240 

25 Glu Gly Thr Gly Ser Asn Gly Glu Glu Leu Gin Met Ala Asp Asp Thr 

245 250 255 

Arg Leu Pro Met Ser Arg Val Val Pro He Pro Ser Ser Arg Leu Thr 

260 265 270 

30 

Pro Tyr Arg Val Val He He Leu Arg Leu He He Leu Cys Phe Phe 
275 280 285 

Leu Gin Tyr Arg Thr Thr His Pro Val Lys Asn Ala Tyr Pro Leu Trp 
35 290 295 300 

Leu Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Leu Leu
305 310 315 320

Asp Gln Phe Pro Lys Trp Tyr Pro Ile Asn Arg Glu Thr Tyr Leu Asp
325 330 335

Arg Leu Ala Ile Arg Tyr Asp Arg Asp Gly Glu Pro Ser Gln Leu Val
340 345 350

Pro Val Asp Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro 
355 360 365 



Leu Val Thr Ala Asn Thr Val Leu Ser Ile Leu Ser Val Asp Tyr Pro
370 375 380

Val Asp Lys Val Ala Cys Tyr Val Ser Asp Asp Gly Ser Ala Met Leu
385 390 395 400

Thr Phe Glu Ser Leu Ser Glu Thr Ala Glu Phe Ala Lys Lys Trp Val
405 410 415

Pro Phe Cys Lys Lys Phe Asn Ile Glu Pro Arg Ala Pro Glu Phe Tyr
420 425 430

Phe Ala Gln Lys Ile Asp Tyr Leu Lys Asp Lys Ile Gln Pro Ser Phe
435 440 445

Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val
450 455 460

Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Ile Pro Glu Glu Gly
465 470 475 480

Trp Thr Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp
485 490 495

His Pro Gly Met Ile Gln Val Phe Leu Gly His Ser Gly Gly Leu Asp
500 505 510

Thr Asp Gly Asn Glu Leu Pro Arg Leu Ile Tyr Val Ser Arg Glu Lys
515 520 525

Arg Pro Gly Phe Gln His His Lys Lys Ala Gly Ala Met Asn Ala Leu
530 535 540

lie Arg Val Ser Val Val Leu Thr Asn Gly Ala Tyr Leu Leu Asn Val 
5 545 550 555 560 

Asp Cys Asp His Tyr Phe Asn Asn Ser Lys Ala lie Lys Glu Ala Met 

565 570 575 

Cys Phe Met Met Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gln

580 585 590 

Phe Pro Gin Arg Phe Asp Gly He Asp Leu His Asp Arg Tyr Ala Asn 
595 600 605 

15 

Arg Asn He Val Phe Phe Asp He Asn Met Lys Gly Leu Asp Gly He 
610 615 620 

Gin Gly Pro Val Tyr Val Gly Thr Gly Cys Cys Phe Asn Arg Gin Ala 
20 625 630 635 640 

Leu Tyr Gly Tyr Asp Pro Val Leu Thr Glu Glu Asp Leu Glu Pro Asn 

645 650 655 

25 He He Val Lys Ser Cys Cys Gly Ser Arg Lys Lys Gly Lys Ser Ser 

660 665 670 

Lys Lys Tyr Asn Tyr Glu Lys Arg Arg Gly He Asn Arg Ser Asp Ser 
675 680 685 

30 

Asn Ala Pro Leu Phe Asn Met Glu Asp He Asp Glu Gly Phe Glu Gly 
690 695 700 

Tyr Asp Asp Glu Arg Ser He Leu Met Ser Gin Arg Ser Val Glu Lys 
35 705 710 715 720 

Arg Phe Gly Gln Ser Pro Val Phe Ile Ala Ala Thr Phe Met Glu Gln
725 730 735

Gly Gly Ile Pro Pro Thr Thr Asn Pro Ala Thr Leu Leu Lys Glu Ala
740 745 750

He His Val He Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp Gly Lys 
5 755 760 765 

Glu He Gly Trp He Tyr Gly Ser Val Thr Glu Asp He Leu Thr Gly 
770 775 780 

10 Phe Lys Met His Ala Arg Gly Trp He Ser He Tyr Cys Asn Pro Pro 
785 790 795 800 

Arg Pro Ala Phe Lys Gly Ser Ala Pro He Asn Leu Ser Asp Arg Leu 

805 810 815 

15 

Asn Gin Val Leu Arg Trp Ala Leu Gly Ser He Glu He Leu Leu Ser 

820 825 830 

Arg His Cys Pro He Trp Tyr Gly Tyr His Gly Arg Leu Arg Leu Leu 
20 835 840 845 

Glu Arg He Ala Tyr lie Asn Thr He Val Tyr Pro He Thr Ser He 
850 855 860 

25 Pro Leu He Ala Tyr Cys He Leu Pro Ala Phe Cys Leu He Thr Asp 
865 870 875 880 

Arg Phe He He Pro Glu He Ser Asn Tyr Ala Ser He Trp Phe He 

885 890 895 

30 

Leu Leu Phe He Ser He Ala Val Thr Gly He Leu Glu Leu Arg Trp 

900 905 910 

Ser Gly Val Ser He Glu Asp Trp Trp Arg Asn Glu Gin Phe Trp Val 
35 915 920 925 

Ile Gly Gly Thr Ser Ala His Leu Phe Ala Val Phe Gln Gly Leu Leu
930 935 940

Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr Ser Lys Ala
945 950 955 960

Thr Asp Glu Asp Gly Asp Phe Ala Glu Leu Tyr lie Phe Lys Trp Thr 
5 965 970 975 

Ala Leu Leu lie Pro Pro Thr Thr Val Leu Leu Val Asn Leu lie Gly 

980 985 990 

10 lie Val Ala Gly Val Ser Tyr Ala Val Asn Ser Gly Tyr Gin Ser Trp 

995 1000 1005 

Gly Pro Leu Phe Gly Lys Leu Phe Phe Ala Leu Trp Val lie Ala His 
1010 1015 1020 

15 

Leu Tyr Pro Phe Leu Lys Gly Leu Leu Gly Arg Gin Asn Arg Thr Pro 
1025 1030 1035 1040 

Thr lie Val lie Val Trp Ser Val Leu Leu Ala Ser lie Phe Ser Leu 
20 1045 1050 1055 

Leu Trp Val Arg lie Asn Pro Phe Val Asp Ala Asn Pro Asn Ala Asn 

1060 1065 1070 

25 Asn Phe Asn Gly Lys Gly Gly Val Phe 

1075 1080 



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1741 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Oryza sativa

(vii) IMMEDIATE SOURCE:

(B) CLONE: S0542

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 101..1741



15 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



GTGCGGCCGC CGCGCATCTA GGCTTGCCGC GCGCGCGCGG ATCTGCGAGC TGCGTAGCCG 60 



TTTCTCGCTG TGAGTGGAGG AGGAGGAGGA AGGGAGGAGG ATG GCG GCG AAC GCG 115 

Met Ala Ala Asn Ala 

1 5

GGG ATG GTG GCG GGA TCC CGC AAC CGG AAC GAG TTC GTC ATG ATC CGC 163 

Gly Met Val Ala Gly Ser Arg Asn Arg Asn Glu Phe Val Met lie Arg 

10 15 20 

25 

CCC GAC GGC GAC GCG CCA CCG CCG GCT AAG CCA GGG AAG AGT GTG AAT 211 

Pro Asp Gly Asp Ala Pro Pro Pro Ala Lys Pro Gly Lys Ser Val Asn 

25 30 35 

30 GGT CAG GTC TGC GAG ATT TGT GGC GAC ACT GTT GGC GTC TCG GCC ACC 259 
Gly Gin Val Cys Gin He Cys Gly Asp Thr Val Gly Val Ser Ala Thr 
40 45 50 

GGC GAC GTC TTT GTT GCC TGC AAT GAG TGC GCC TTC CCG GTC TGC CGC 3 07 

35 Gly Asp Val Phe Val Ala Cys Asn Glu Cys Ala Phe Pro Val Cys Arg 
55 60 65 

CCT TGC TAC GAG TAC GAA CGC AAG GAA GGG AAC CAG TGC TGC CCC CAG 355
Pro Cys Tyr Glu Tyr Glu Arg Lys Glu Gly Asn Gln Cys Cys Pro Gln
70 75 80 85

TGC AAG ACT AGA TAC AAG AGG CAC AAA GGT TGC CCT AGA GTT CAG GGC 403

Cys Lys Thr Arg Tyr Lys Arg His Lys Gly Cys Pro Arg Val Gln Gly
90 95 100

5 GAT GAG GAA GAA GAA GAT GTT GAT GAC CTG GAC AAT GAA TTC CAT TAT 4 51 

Asp Glu Glu Glu Glu Asp Val Asp Asp Leu Asp Asn Glu Phe His Tyr 

105 110 115 

AAG CAT GGC AAT GGC AAA GGT CCA GAG TGG CAG ATA CAG AGA CAG GGG 49 9 

10 Lys His Gly Asn Gly Lys Gly Pro Glu Trp Gin lie Gin Arg Gin Gly 

120 125 130 

GAA GAT GTT GAC CTG TCT TCA TCT TCT CGC CAC GAA CAA CAT CGG ATT 54 7 

Glu Asp Val Asp Leu Ser Ser Ser Ser Arg His Glu Gin His Arg lie 
15 135 140 145 

CCC CGT CTG ACA AGT GGG CAA CAG ATC TCA GGA GAG ATC CCT GAT GCT 5 95 

Pro Arg Leu Thr Ser Gly Gin Gin lie Ser Gly Glu He Pro Asp Ala 
150 155 160 165 

20 

TCC CCC GAT CGC CAT TCT ATC CGC AGC GGA ACA TCA AGC TAT GTT GAT 64 3 

Ser Pro Asp Arg His Ser He Arg Ser Gly Thr Ser Ser Tyr Val Asp 

170 175 180 

25 CCA AGT GTT CCA GTT CCT GTG AGG ATT GTG GAC CCC TCC AAG GAC TTG 6 91 

Pro Ser Val Pro Val Pro Val Arg He Val Asp Pro Ser Lys Asp Leu 

185 190 195 

AAT TCC TAT GGG ATT AAC AGT GTT GAC TGG CAA GAA AGA GTT GCC AGC 73 9 

30 Asn Ser Tyr Gly He Asn Ser Val Asp Trp Gin Glu Arg Val Ala Ser 

200 205 210 

TGG AGG AAC AAG CAG GAC AAA AAT ATG ATG CAG GTA GCT AAT AAA TAT 787 
Trp Arg Asn Lys Gin Asp Lys Asn Met Met Gin Val Ala Asn Lys Tyr 
35 215 220 225 

CCA GAG GCA AGA GGG GGA GAC ATG GAA GGG ACT GGT TCA AAT GGT GAA 83 5 

Pro Glu Ala Arg Gly Gly Asp Met Glu Gly Thr Gly Ser Asn Gly Glu 
230 235 240 245 

GAT ATC CAA ATG GTT GAT GAT GCA CGT CTA CCT CTG AGC CGC ATA GTG 883
Asp Ile Gln Met Val Asp Asp Ala Arg Leu Pro Leu Ser Arg Ile Val
250 255 260

CCT ATC CCT TCA AAC CAG CTC AAC CTT TAC CGG ATT GTT ATC ATT CTC 931

Pro Ile Pro Ser Asn Gln Leu Asn Leu Tyr Arg Ile Val Ile Ile Leu
265 270 275

CGT CTT ATC ATC CTG ATG TTC TTC TTC CAA TAT CGT GTC ACT CAT CCA 979

Arg Leu Ile Ile Leu Met Phe Phe Phe Gln Tyr Arg Val Thr His Pro
280 285 290

GTG CGG GAT GCT TAT GGA TTG TGG CTA GTA TCT GTT ATC TGT GAA ATT 1027

Val Arg Asp Ala Tyr Gly Leu Trp Leu Val Ser Val Ile Cys Glu Ile
295 300 305

TGG TTG CCC TTA TCC TGG CTC CTA GAT CAA TTC CCA AAG TGG TAC CCG 1075

Trp Leu Pro Leu Ser Trp Leu Leu Asp Gln Phe Pro Lys Trp Tyr Pro
310 315 320 325

ATA AAC CGT GAA ACA TAC CTT GAC AGG CTT GCA TTG AGA TAT GAT AGG 1123

Ile Asn Arg Glu Thr Tyr Leu Asp Arg Leu Ala Leu Arg Tyr Asp Arg
330 335 340

GAG GGA GAG CCA TCA CAG CTT GCT CCC ATT GAT GTC TTT GTC AGT ACG 1171

Glu Gly Glu Pro Ser Gln Leu Ala Pro Ile Asp Val Phe Val Ser Thr
345 350 355

GTG GAT CCA CTA AAG GAA CCT CCT CTG ATC ACA GCA AAC ACT GTT TTG 1219

Val Asp Pro Leu Lys Glu Pro Pro Leu Ile Thr Ala Asn Thr Val Leu
360 365 370

TCC ATT CTG GCT GTG GAT TAC CCT GTT GAC AAA GTG TCA TGC TAT GTT 1267

Ser Ile Leu Ala Val Asp Tyr Pro Val Asp Lys Val Ser Cys Tyr Val
375 380 385

TCT GAC GAT GGT TCA GCT ATG TTA ACT TTT GAG GCT CTG TCA GAA ACT 1315

Ser Asp Asp Gly Ser Ala Met Leu Thr Phe Glu Ala Leu Ser Glu Thr
390 395 400 405

GCA GAA TTT GCT AGG AAG TGG GTT CCG TTT TGC AAG AAG CAC AAT ATT 1363

Ala Glu Phe Ala Arg Lys Trp Val Pro Phe Cys Lys Lys His Asn Ile
410 415 420

GAA CCA CGA GCT CCA GAG TTT TAC TTT GCT CAA AAA ATA GAT TAC CTG 1411

Glu Pro Arg Ala Pro Glu Phe Tyr Phe Ala Gln Lys Ile Asp Tyr Leu
425 430 435

AAG GAC AAA ATC CAA CCT TCC TTT GTT AAA GAA AGG CGG GCA ATG AAG 1459

Lys Asp Lys Ile Gln Pro Ser Phe Val Lys Glu Arg Arg Ala Met Lys
440 445 450

AGA GAG TAT GAA GAA TTC AAG GTA CGG ATC AAT GCT CTT GTT GCG AAG 1507

Arg Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys
455 460 465

GCA CAA AAA GTA CCT GAA GAG GGG TGG ACC ATG GCT GAT GGC ACT GCT 1555

Ala Gln Lys Val Pro Glu Glu Gly Trp Thr Met Ala Asp Gly Thr Ala
470 475 480 485

TGG CCT GGG AAT AAC CCA AGG GAT CAC CCT GGC ATG ATT CAG GTG TTC 1603

Trp Pro Gly Asn Asn Pro Arg Asp His Pro Gly Met Ile Gln Val Phe
490 495 500

TTG GGG CAC AGT GGT GGG CTT GAC ACT GAT GGT AAC GAG TTG CCA CGG 1651

Leu Gly His Ser Gly Gly Leu Asp Thr Asp Gly Asn Glu Leu Pro Arg
505 510 515

CTT GTC TAC GTC TCT CGT GAA AAG AGG CCA GGA TTC CAG CAT CAC AAG 1699

Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Gln His His Lys
520 525 530

AAG GCT GGT GCA ATG AAT GCA TTG ATT CGT GTA TCT GCT GTG 1741
Lys Ala Gly Ala Met Asn Ala Leu Ile Arg Val Ser Ala Val
535 540 545
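
The feature table for SEQ ID NO: 13 places the coding sequence at positions 101..1741 of the 1741-base-pair cDNA. As a minimal sketch, assuming 1-based inclusive coordinates and the standard genetic code (the function names `cds_codon_count` and `translate` are hypothetical helpers, not part of the specification), the span can be checked for a whole number of codons and the first five codons of the listing can be translated:

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna: str) -> str:
    """Translate a sense-strand DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(cds_codon_count(101, 1741))        # 547
print(translate("ATG GCG GCG AAC GCG"))  # MAANA
```

The count of 547 codons agrees with the length stated for SEQ ID NO: 14, and `MAANA` corresponds to the Met Ala Ala Asn Ala shown under the first five codons of the listing.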



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 547 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Ala Ala Asn Ala Gly Met Val Ala Gly Ser Arg Asn Arg Asn Glu
1 5 10 15

15 Phe Val Met He Arg Pro Asp Gly Asp Ala Pro Pro Pro Ala Lys Pro 

20 25 30 

Gly Lys Ser Val Asn Gly Gin Val Cys Gin lie Cys Gly Asp Thr Val 
35 40 45 

20 

Gly Val Ser Ala Thr Gly Asp Val Phe Val Ala Cys Asn Glu Cys Ala 
50 55 60 

Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Lys Glu Gly Asn 
25 65 70 75 80 

Gin Cys Cys Pro Gin Cys Lys Thr Arg Tyr Lys Arg His Lys Gly Cys 

85 90 95

30 Pro Arg Val Gin Gly Asp Glu Glu Glu Glu Asp Val Asp Asp Leu Asp 

100 105 110 

Asn Glu Phe His Tyr Lys His Gly Asn Gly Lys Gly Pro Glu Trp Gin 
115 120 125 

35 

He Gin Arg Gin Gly Glu Asp Val Asp Leu Ser Ser Ser Ser Arg His 
130 135 140 

Glu Gln His Arg Ile Pro Arg Leu Thr Ser Gly Gln Gln Ile Ser Gly
145 150 155 160

Glu Ile Pro Asp Ala Ser Pro Asp Arg His Ser Ile Arg Ser Gly Thr
165 170 175

Ser Ser Tyr Val Asp Pro Ser Val Pro Val Pro Val Arg He Val Asp 
5 180 185 190 

Pro Ser Lys Asp Leu Asn Ser Tyr Gly He Asn Ser Val Asp Trp Gin 
195 200 205 

10 Glu Arg Val Ala Ser Trp Arg Asn Lys Gin Asp Lys Asn Met Met Gin 
210 215 220 

Val Ala Asn Lys Tyr Pro Glu Ala Arg Gly Gly Asp Met Glu Gly Thr 
225 230 235 240 

15 

Gly Ser Asn Gly Glu Asp He Gin Met Val Asp Asp Ala Arg Leu Pro 

245 250 255 

Leu Ser Arg He Val Pro He Pro Ser Asn Gin Leu Asn Leu Tyr Arg 
20 260 265 270 

He Val He He Leu Arg Leu He He Leu Met Phe Phe Phe Gin Tyr 
275 280 285 

25 Arg Val Thr His Pro Val Arg Asp Ala Tyr Gly Leu Trp Leu Val Ser 
290 295 300 

Val He Cys Glu He Trp Leu Pro Leu Ser Trp Leu Leu Asp Gin Phe 
305 310 315 320 

30 

Pro Lys Trp Tyr Pro He Asn Arg Glu Thr Tyr Leu Asp Arg Leu Ala 

325 330 335 

Leu Arg Tyr Asp Arg Glu Gly Glu Pro Ser Gin Leu Ala Pro He Asp 
35 340 345 350 



Val Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu Ile Thr
355 360 365

Ala Asn Thr Val Leu Ser Ile Leu Ala Val Asp Tyr Pro Val Asp Lys
370 375 380

5 Val Ser Cys Tyr Val Ser Asp Asp Gly Ser Ala Met Leu Thr Phe Glu 
385 390 395 400 

Ala Leu Ser Glu Thr Ala Glu Phe Ala Arg Lys Trp Val Pro Phe Cys 

405 410 415 

10 

Lys Lys His Asn lie Glu Pro Arg Ala Pro Glu Phe Tyr Phe Ala Gin 

420 425 430 

Lys He Asp Tyr Leu Lys Asp Lys lie Gin Pro Ser Phe Val Lys Glu 
15 435 440 445 

Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val Arg He Asn 
450 455 460 

20 Ala Leu Val Ala Lys Ala Gin Lys Val Pro Glu Glu Gly Trp Thr Met 
465 470 475 480 

Ala Asp Gly Thr Ala Trp Pro Gly Asn Asn Pro Arg Asp His Pro Gly 

485 490 495 

25 

Met He Gin Val Phe Leu Gly His Ser Gly Gly Leu Asp Thr Asp Gly 

500 505 510 

Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly 
30 515 520 525 

Phe Gin His His Lys Lys Ala Gly Ala Met Asn Ala Leu He Arg Val 
530 535 540 

35 Ser Ala Val 
545 
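
The N-terminal residues of SEQ ID NO: 12 and SEQ ID NO: 14 listed above can be used to illustrate how a percentage identity between two amino acid sequences is computed. The sketch below is illustrative only: it assumes the two 16-residue stretches are compared position by position without gaps, and that identity is the number of identical positions divided by the aligned length. The alignment method and the function name `percent_identity` are assumptions for this example, not definitions taken from the specification, and the result says nothing about identity over the full-length sequences.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # A gap character ("-") never counts as a match.
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 1-16 in one-letter code, transcribed from the listings above.
SEQ12_N_TERM = "MEASAGLVAGSYRRNE"  # SEQ ID NO: 12
SEQ14_N_TERM = "MAANAGMVAGSRNRNE"  # SEQ ID NO: 14

identity = percent_identity(SEQ12_N_TERM, SEQ14_N_TERM)
print(f"{identity:.2f}%")  # 68.75%

# Compare against thresholds of 40%, 60% and 80%.
for threshold in (40, 60, 80):
    print(threshold, identity >= threshold)
```

For this short stretch, 11 of 16 positions match, so the fragment would pass a 40% or 60% threshold but not an 80% threshold. A real comparison would first align the full sequences with an established alignment tool.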



CLAIMS:

1. An isolated nucleic acid molecule which encodes a polypeptide of the cellulose
biosynthetic pathway or a homologue, analogue or derivative thereof or a complementary
sequence thereto, wherein said polypeptide is capable of producing cellulose and/or
β-1,4-glucan and/or an intermediate between cellulose and a β-1,4-glucan polymer.

2. The isolated nucleic acid molecule according to claim 1 wherein the polypeptide is
cellulose synthase or a catalytic subunit thereof.

3. The isolated nucleic acid molecule according to claim 1 or 2, derived from a
prokaryote.

4. The isolated nucleic acid molecule according to claim 3, wherein the prokaryote is
a bacterium other than Agrobacterium tumefaciens, Acetobacter pasteurianus or Acetobacter
xylinum.

5. The isolated nucleic acid molecule according to claim 1 or 2, derived from a
eukaryote.

6. The isolated nucleic acid molecule according to claim 5, wherein the eukaryote is a
plant or fungus.

7. The isolated nucleic acid molecule according to claim 6, wherein the plant is selected
from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa
(rice), wheat, barley, maize, Brassica ssp., Eucalyptus ssp., hemp, jute, flax, Pinus ssp.,
Populus ssp., and Picea spp., amongst others.

8. The isolated nucleic acid molecule according to claim 2 wherein the cellulose synthase
or catalytic subunit thereof is the Arabidopsis thaliana RSW1 polypeptide.

9. The isolated nucleic acid molecule according to any one of claims 1 to 8, comprising
a sequence of nucleotides which is at least 40% identical to any one of SEQ ID NOs:1, 3,
4, 5, 7, 9, 11 or 13 or a complementary sequence thereof.

10. The isolated nucleic acid molecule according to claim 9, wherein the percentage
identity to any one of SEQ ID NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence
thereof is at least 60%.

11. The isolated nucleic acid molecule according to claim 9, wherein the percentage
identity to any one of SEQ ID NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence
thereof is at least 80%.

12. An isolated nucleic acid molecule which comprises a sequence of nucleotides
substantially as set forth in any one of SEQ ID NOs:3, 4, 5, 7, 9 or 11 or a homologue,
analogue or derivative thereof or a complementary sequence thereto.

13. The isolated nucleic acid molecule according to any one of claims 1 to 12, wherein
said nucleic acid molecule hybridizes under at least low stringency conditions to at least 20
contiguous nucleotides of any one of SEQ ID NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a
complementary sequence thereto.

14. An isolated nucleic acid molecule which encodes a polypeptide which is capable of
cellulose and/or β-1,4-glucan biosynthesis in a plant cell, fungal cell, insect cell, animal cell,
yeast cell or bacterial cell when expressed therein.

15. The isolated nucleic acid molecule according to claim 14, wherein the polypeptide is
cellulose synthase or a catalytic subunit thereof.

16. The isolated nucleic acid molecule according to claim 14 or 15, derived from a
prokaryote.

17. The isolated nucleic acid molecule according to claim 16, wherein the prokaryote is
a bacterium other than Agrobacterium tumefaciens, Acetobacter pasteurianus or Acetobacter
xylinum.

18. The isolated nucleic acid molecule according to claim 14 or 15, derived from a
eukaryote.

19. The isolated nucleic acid molecule according to claim 18, wherein the eukaryote is
a plant or fungus.

20. The isolated nucleic acid molecule according to claim 19, wherein the plant is selected
from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa
(rice), wheat, barley, maize, Brassica ssp., Eucalyptus ssp., hemp, jute, flax, Pinus ssp.,
Populus ssp., and Picea spp., amongst others.

21. The isolated nucleic acid molecule according to claim 20, wherein the cellulose
synthase or catalytic subunit thereof is the Arabidopsis thaliana RSW1 polypeptide.

22. The isolated nucleic acid molecule according to any one of claims 14 to 21,
comprising a sequence of nucleotides which is at least 40% identical to any one of SEQ ID
NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereto.

23. The isolated nucleic acid molecule according to claim 22, wherein the percentage
identity to any one of SEQ ID NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence
thereof is at least 60%.

24. The isolated nucleic acid molecule according to claim 22, wherein the percentage
identity to any one of SEQ ID NOs:1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence
thereof is at least 80%.

25. The isolated nucleic acid molecule according to claim 22, comprising the sequence
of nucleotides substantially as set forth in any one of SEQ ID NOs:3, 4, 5, 7, 9 or 11 or a
homologue, analogue or derivative thereof or a complementary sequence thereto.

26. An isolated nucleic acid molecule which encodes or is complementary to a nucleic
acid molecule which encodes a polypeptide capable of cellulose and/or β-1,4-glucan
biosynthesis wherein said polypeptide comprises a sequence of amino acids which is at least
40% identical to any one of SEQ ID Nos:2, 6, 8, 10, 12 or 14.

27. The isolated nucleic acid molecule according to claim 26, wherein the percentage
identity to any one of SEQ ID Nos:2, 6, 8, 10, 12 or 14 is at least 60%.

28. The isolated nucleic acid molecule according to claim 27, wherein the percentage
identity to any one of SEQ ID Nos:2, 6, 8, 10, 12 or 14 is at least 80%.

29. The isolated nucleic acid molecule according to claim 26, wherein the polypeptide
comprises a sequence of amino acids substantially as set forth in any one of SEQ ID Nos:2,
6, 8, 10, 12 or 14.

30. A genetic construct which comprises the isolated nucleic acid molecule according to
any one of claims 1 to 29.

31. A genetic construct which comprises the isolated nucleic acid molecule according to
any one of claims 1 to 29 operably connected to a promoter sequence.

32. The genetic construct according to claim 31, wherein the nucleic acid molecule is
operably connected to the promoter sequence in the sense orientation such that RNA which
encodes a polypeptide capable of cellulose and/or β-1,4-glucan biosynthesis or a homologue,
analogue or derivative thereof is produced when said nucleic acid molecule is expressed.

33. The genetic construct according to claim 31, wherein the nucleic acid molecule is
operably connected to the promoter sequence in the antisense orientation such that RNA
which is complementary to RNA which encodes a polypeptide capable of cellulose and/or
β-1,4-glucan biosynthesis or a homologue, analogue or derivative thereof, is produced when
said nucleic acid molecule is expressed.

34. The genetic construct according to claim 33, wherein the nucleic acid molecule
encodes an antisense or ribozyme molecule.

35. The genetic construct according to any one of claims 31 to 34, wherein the promoter
is the CaMV 35S promoter.

36. The genetic construct according to any one of claims 31 to 34, wherein the promoter
is the Arabidopsis thaliana RSW1 gene promoter.

37. A method of increasing the level of cellulose in a cell, tissue, organ or organism, said
method comprising expressing the isolated nucleic acid molecule according to any one of
claims 1 to 29 therein, in the sense orientation, for a time and under conditions at least
sufficient to produce or increase expression of the polypeptide encoded therefor.

38. The method according to claim 37, comprising the additional first step of
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.

39. The method according to claim 38, wherein the cell is a prokaryotic cell.

40. The method according to claim 38, wherein the cell, tissue, organ or organism is a
eukaryotic cell, tissue, organ or organism.

41. The method according to claim 40, wherein the cell, tissue, organ or organism is a
plant, fungal, insect, animal or yeast cell, tissue, organ or organism.

42. The method according to claim 41, wherein the cell, tissue, organ or organism is a
plant cell, tissue, organ or organism.

43. The method according to claim 42 wherein the plant is selected from the list
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

44. A method of reducing the level of non-crystalline β-1,4-glucan in a cell, tissue, organ
or organism, said method comprising expressing the isolated nucleic acid molecule according
to any one of claims 1 to 29 therein, in the sense orientation, for a time and under conditions
at least sufficient to produce or increase expression of the polypeptide encoded therefor.

45. The method according to claim 44, comprising the additional first step of
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.

46. The method according to claim 44, wherein the cell is a prokaryotic cell.

47. The method according to claim 44, wherein the cell, tissue, organ or organism is a
eukaryotic cell, tissue, organ or organism.

48. The method according to claim 47, wherein the cell, tissue, organ or organism is a
plant, fungal, insect, animal or yeast cell, tissue, organ or organism.

49. The method according to claim 48, wherein the cell, tissue, organ or organism is a
plant cell, tissue, organ or organism.

50. The method according to claim 50 wherein the plant is selected from the list
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

51. A method of reducing the level of starch in a cell, tissue, organ or organism, said
method comprising expressing the isolated nucleic acid molecule according to any one of
claims 1 to 29 therein, in the sense orientation, for a time and under conditions at least
sufficient to produce or increase expression of the polypeptide encoded therefor.

52. The method according to claim 50, comprising the additional first step of
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.

53. The method according to claim 51, wherein the cell is a prokaryotic cell.

54. The method according to claim 53, wherein the cell, tissue, organ or organism is a
eukaryotic cell, tissue, organ or organism.

55. The method according to claim 54, wherein the eukaryote is a plant, fungus, insect,
animal or yeast.

56. The method according to claim 55, wherein the eukaryote is a plant.

57. The method according to claim 56 wherein the plant is selected from the list
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

58. A method of reducing the level of cellulose in a cell, tissue, organ or organism, said 
method comprising expressing the isolated nucleic acid molecule according to any one of
claims 1 to 29 therein, in the antisense orientation, for a time and under conditions at least
sufficient to prevent or reduce the expression of the polypeptide encoded therefor. 

59. The method according to claim 58, comprising the additional first step of 
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.

60. The method according to claims 58 or 59, wherein the cell, tissue, organ or organism 
is a eukaryotic cell, tissue, organ or organism. 

61. The method according to claim 60, wherein the eukaryote is a plant, fungus, insect,
animal or yeast. 

62. The method according to claim 61, wherein the eukaryote is a plant. 

63. The method according to claim 62 wherein the plant is selected from the list
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

64. A method of increasing the level of non-crystalline β-1,4-glucan in a cell, tissue,
organ or organism, said method comprising expressing the isolated nucleic acid molecule 
according to any one of claims 1 to 29 therein, in the antisense orientation, for a time and 
under conditions at least sufficient to prevent or reduce the expression of the polypeptide 
encoded therefor. 

65. The method according to claim 64, comprising the additional first step of 
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. 

66. The method according to claims 64 or 65, wherein the cell, tissue, organ or organism 
is a eukaryotic cell, tissue, organ or organism.

67. The method according to claim 66, wherein the eukaryote is a plant, fungus, insect,
animal or yeast. 

68. The method according to claim 67, wherein the eukaryote is a plant. 

69. The method according to claim 68 wherein the plant is selected from the list 
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

70. A method of increasing the level of starch in a cell, tissue, organ or organism, said 
method comprising expressing the isolated nucleic acid molecule according to any one of 
claims 1 to 29 therein, in the antisense orientation, for a time and under conditions at least 
sufficient to prevent or reduce the expression of the polypeptide encoded therefor. 

71. The method according to claim 70, comprising the additional first step of 
transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. 

72. The method according to claims 70 or 71, wherein the cell, tissue, organ or organism
is a eukaryotic cell, tissue, organ or organism.

73. The method according to claim 72, wherein the eukaryote is a plant, fungus, insect, 
animal or yeast. 

74. The method according to claim 73, wherein the eukaryote is a plant.

75. The method according to claim 74 wherein the plant is selected from the list 
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), 
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants 
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

76. A method of producing a recombinant enzymatically active polypeptide which is
capable of synthesizing cellulose and/or β-1,4-glucan and/or an intermediate between
cellulose and β-1,4-glucan in a cell, said method comprising expressing the isolated nucleic
acid molecule according to any one of claims 1 to 29 or a homologue, analogue or derivative
thereof in said cell for a time and under conditions sufficient for the polypeptide encoded
therefor to be produced. 

77. The method according to claim 76, comprising the additional first step of 
transforming the cell with the isolated nucleic acid molecule according to any one of claims
1 to 29 or the genetic construct according to any one of claims 11 to 15.

78. A recombinant polypeptide produced according to the method defined by claim 76 or 
77. 

79. The recombinant cellulose biosynthetic polypeptide according to claim 78, further
defined as a recombinant cellulose synthase or catalytically active subunit thereof. 

80. A recombinant cellulose biosynthetic polypeptide capable of cellulose and/or
β-1,4-glucan production and comprising a sequence of amino acids set forth in any one of
SEQ ID Nos: 2, 6, 8, 10, 12 or 14 or a homologue, analogue or derivative thereof which is
at least 40% identical thereto.

81. The recombinant cellulose biosynthetic polypeptide according to claim 80, wherein 
the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least 60%. 

82. The recombinant cellulose biosynthetic polypeptide according to claim 81, wherein 
the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least 80%.

83. The recombinant cellulose biosynthetic polypeptide according to claim 82, comprising 
a sequence of amino acids substantially as set forth in any one of SEQ ID Nos: 2, 6, 8, 10,
12 or 14.

84. A method of altering the mechanical properties of a cell wall, said method comprising 
expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 in the
antisense orientation in said cell for a time and under conditions sufficient for the level of
non-crystalline β-1,4-glucan to increase in said cell.

85. The method according to claim 84 wherein the non-crystalline β-1,4-glucan is
cross-linked to cellulose microfibrils.

86. The method according to claim 84 or 85 wherein the cell wall normally has a high 
ratio of cellulose to hemicelluloses. 

87. The method according to any one of claims 84 to 86, wherein the nucleic acid 
molecule expressed in the antisense orientation is contained within an antisense molecule or
ribozyme molecule.

88. The method according to any one of claims 84 to 87, wherein the cell wall is a plant 
cell wall. 

89. The method according to claim 88, wherein the plant is selected from the list 
comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

90. An antibody molecule which binds to the recombinant polypeptide according to any 
one of claims 78 to 83 or a homologue, analogue or derivative thereof. 

91. A transgenic plant transformed with the isolated nucleic acid molecule according to 
any one of claims 1 to 29 or a genetic construct according to any one of claims 30 to 36.

WO 98/00549 PCT/AU97/00402

92. The transgenic plant according to claim 91, wherein said plant is selected from the 
list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice),
Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants
such as Pinus ssp., Populus ssp., Picea spp., amongst others.

93. Use of an isolated nucleic acid molecule according to any one of claims 1 to 29 to 
modify the cellulose content of a cell. 

94. Use according to claim 93, wherein if the nucleic acid molecule according to any one 
of claims 1 to 29 is expressed in the sense orientation in said cell, the level of cellulose
therein is increased.

95. Use according to claim 93, wherein if the nucleic acid molecule according to any one 
of claims 1 to 29 is expressed in the antisense orientation in said cell, the level of cellulose
therein is decreased.

96. Use according to claim 95, wherein said cell is further characterised by increased 
non-crystalline β-1,4-glucan content and/or starch content.

97. Use according to claim 95 or 96, wherein said cell is further characterised by
increased cross-linking of non-crystalline β-1,4-glucan to cellulose.

98. Use according to any one of claims 93 to 97, wherein the cell is a plant cell. 

99. Use according to claim 98 wherein the plant is selected from the list comprising
Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., 
Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., 
Populus ssp., Picea spp., amongst others. 